spiderwebai-py 0.1.4

Creator: bradpython12


SpiderWebAI Python SDK
The SpiderWebAI Python SDK offers a toolkit for straightforward website scraping, crawling at scale, and other utilities like extracting links and taking screenshots, enabling you to collect data formatted for compatibility with large language models (LLMs). It provides a user-friendly interface for seamless integration with the SpiderWebAI API.
Installation
To install the SpiderWebAI Python SDK, you can use pip:
pip install spiderwebai-py

Usage

1. Get an API key from spiderwebai.xyz.
2. Set the API key as an environment variable named SPIDER_API_KEY, or pass it as a parameter to the SpiderWebAIApp class.
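As a sketch, the environment-variable route can be read like this (the placeholder fallback is illustrative, not SDK behavior):

```python
import os

# Prefer the SPIDER_API_KEY environment variable; fall back to a
# placeholder so local experiments fail at the API, not here.
api_key = os.environ.get("SPIDER_API_KEY", "your_api_key")
```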

Here's an example of how to use the SDK:
from spiderwebai import SpiderWebAIApp

# Initialize the SpiderWebAIApp with your API key
app = SpiderWebAIApp(api_key='your_api_key')

# Scrape a single URL
url = 'https://spiderwebai.xyz'
scraped_data = app.scrape_url(url)

# Crawl a website
crawler_params = {
    'limit': 1,
    'proxy_enabled': True,
    'store_data': False,
    'metadata': False,
    'request': 'http'
}
crawl_result = app.crawl_url(url, params=crawler_params)

Scraping a URL
To scrape data from a single URL:
url = 'https://example.com'
scraped_data = app.scrape_url(url)

Crawling a Website
To automate crawling a website:
url = 'https://example.com'
crawl_params = {
    'limit': 200,
    'request': 'smart_mode'
}
crawl_result = app.crawl_url(url, params=crawl_params)

Retrieving Links from a URL
Extract all links from a specified URL:
url = 'https://example.com'
links = app.links(url)

Taking Screenshots of a URL
Capture a screenshot of a given URL:
url = 'https://example.com'
screenshot = app.screenshot(url)

Extracting Contact Information
Extract contact details from a specified URL:
url = 'https://example.com'
contacts = app.extract_contacts(url)

Labeling Data from a URL
Label the data extracted from a particular URL:
url = 'https://example.com'
labeled_data = app.label(url)

Checking Available Credits
You can check the remaining credits on your account:
credits = app.get_credits()
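Since crawls consume credits, you may want to gate large jobs on the remaining balance. A hypothetical guard; the "credits" field name and the threshold are illustrative assumptions, not the documented response shape:

```python
def has_enough_credits(credits, minimum=100):
    """Return True if the reported balance covers the planned job.

    Assumes `credits` is a mapping with a numeric "credits" field;
    verify the real get_credits() response shape before relying on it.
    """
    return credits.get("credits", 0) >= minimum

# Usage with a real client:
# if has_enough_credits(app.get_credits(), minimum=500):
#     app.crawl_url('https://example.com', params={'limit': 200})
```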

Streaming
If you need to stream the response, pass True as the third parameter:
url = 'https://example.com'

crawler_params = {
    'limit': 1,
    'proxy_enabled': True,
    'store_data': False,
    'metadata': False,
    'request': 'http'
}

links = app.links(url, crawler_params, True)

Content-Type
The following Content-Type headers are supported via the fourth parameter:

application/json
text/csv
application/xml
application/jsonl

url = 'https://example.com'

crawler_params = {
    'limit': 1,
    'proxy_enabled': True,
    'store_data': False,
    'metadata': False,
    'request': 'http'
}

# Stream JSON Lines back to the client
crawl_result = app.crawl_url(url, crawler_params, True, "application/jsonl")
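When requesting application/jsonl, each line of the response body is a standalone JSON document. A minimal, SDK-independent sketch of decoding such a payload (the field names in the sample are illustrative):

```python
import json

def parse_jsonl(payload):
    """Decode a JSON Lines payload: one JSON document per non-empty line."""
    records = []
    for line in payload.splitlines():
        line = line.strip()
        if line:  # skip blank lines between records
            records.append(json.loads(line))
    return records

# Example payload with two crawled pages (illustrative fields)
payload = (
    '{"url": "https://example.com", "status": 200}\n'
    '{"url": "https://example.com/about", "status": 200}\n'
)
pages = parse_jsonl(payload)
```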

Error Handling
The SDK handles errors returned by the SpiderWebAI API and raises appropriate exceptions. If an error occurs during a request, an exception will be raised with a descriptive error message.
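The SDK's exception classes are not enumerated here, so a defensive caller can wrap requests generically. A hypothetical retry helper; the catch-all Exception and the backoff schedule are assumptions, not documented SDK behavior:

```python
import time

def call_with_retries(fn, attempts=3, backoff=1.0):
    """Call fn(), retrying on any exception with linear backoff."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:  # the SDK raises descriptive exceptions
            last_error = exc
            if attempt < attempts:
                time.sleep(backoff * attempt)  # wait longer each retry
    raise last_error

# Usage with a real client:
# result = call_with_retries(lambda: app.scrape_url('https://example.com'))
```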
Contributing
Contributions to the SpiderWebAI Python SDK are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.
License
The SpiderWebAI Python SDK is open-source and released under the MIT License.
