jwm.robotstxt 1.0.8

Creator: rpa-with-ash

Last updated:

Description:

jwm.robotstxt
Python Wrapper for Google's Robotstxt Parser
Provides Python access to Google's parser for robots.txt files, as used by their Googlebot web crawler.
Websites may provide an optional robots.txt file at the root of their domain to govern the access and behavior of web scrapers. Googlebot, one of the best-known web crawlers, is responsible for promoting this standard, and sites interested in SEO will closely conform to Googlebot's behavior.
All credit for the parser goes to the Google team who created, open-sourced, and promoted it.
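
As a rough illustration of how robots.txt rules govern access, the sketch below uses Python's standard-library urllib.robotparser rather than this package. The robots.txt content, the bot names, and the can_fetch helper are made up for the example. The standard-library parser applies rules in file order, whereas Google's parser prefers the most specific matching rule, so the two can disagree on less trivial files.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GoodBot may crawl everything except /private.
ROBOTS_TXT = """\
User-agent: GoodBot
Disallow: /private
Allow: /
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the standard-library parser lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() expects an iterable of lines, not a single string.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/public", "/private/page"):
        print(path, can_fetch(ROBOTS_TXT, "GoodBot", path))
```

Agents with no matching group (and no wildcard group) are unrestricted, which is why a robots.txt file is advisory rather than an access control mechanism.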

SEO (Search Engine Optimization): The process of modifying a website's content or metadata to boost its ranking in search engines' page indexes. Higher rankings lead to higher positions in user searches, which in turn lead to more visitors. For further details, see the SEO Wikipedia page.

Usage
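Basic usage with the RobotsMatcher class provided by Google.
import jwm.robotstxt.googlebot

# robots.txt content granting GoodBot access to /path.
robotstxt = """
user-agent: GoodBot
allow: /path
"""

# Arguments: the robots.txt body, the user agents to match, and the URL to check.
matcher = jwm.robotstxt.googlebot.RobotsMatcher()
assert matcher.AllowedByRobots(robotstxt, ("GoodBot",), "/path")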

Check out the documentation for further details. For more use cases, see the test cases for jwm.robotstxt and robotstxt.
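
When several URLs need checking against the same robots.txt, the call shown above can be wrapped in a small helper. This is a minimal sketch: allowed_paths is a hypothetical name that is not part of the package, and it assumes only that the matcher exposes AllowedByRobots with the argument order used in the basic example.

```python
from typing import Iterable, List


def allowed_paths(matcher, robotstxt: str, user_agent: str, paths: Iterable[str]) -> List[str]:
    """Return the paths that the matcher reports as allowed for user_agent.

    matcher is any object with an AllowedByRobots(robotstxt, user_agents, url)
    method, such as the RobotsMatcher from the basic usage example.
    """
    return [
        path
        for path in paths
        if matcher.AllowedByRobots(robotstxt, (user_agent,), path)
    ]


# Usage with the real matcher (requires jwm.robotstxt to be installed):
#   import jwm.robotstxt.googlebot
#   matcher = jwm.robotstxt.googlebot.RobotsMatcher()
#   allowed_paths(matcher, robotstxt, "GoodBot", ["/path", "/other"])
```

Passing the matcher in as an argument keeps the helper easy to test with a stand-in object.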
Installation
Install from PyPI under the jwm.robotstxt distribution.
pip install jwm.robotstxt

Import into your program through the jwm.robotstxt.googlebot package.
import jwm.robotstxt.googlebot
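
To confirm that the distribution was installed into the current environment, its version can be read from the package metadata. This is a sketch using the standard-library importlib.metadata module (Python 3.8+); installed_version is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("jwm.robotstxt")
    print(found if found is not None else "jwm.robotstxt is not installed")
```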

Virtual Environment
It is highly recommended to install Python projects into a virtual environment; see PEP 405 for the motivation.
Create a virtual environment in the .venv directory.
python3 -m venv ./.venv

Activate with the correct command for your system.
# Linux/macOS
. ./.venv/bin/activate

# Windows
.\.venv\Scripts\activate
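
To check from within Python that the virtual environment is active, compare the interpreter's prefixes. This is a small sketch; in_virtual_environment is a hypothetical helper name, and the check applies to environments created with the venv module as shown above.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```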

Installing from source
Make sure you have cloned the repository and its submodules.
git clone --recurse-submodules https://github.com/jwmorley73/jwm.robotstxt.git

Install the project using pip. This builds the required robotstxt static libraries and links them into the resulting Python package.
pip install .

If you want to include the developer tooling, add the dev optional dependencies.
pip install ".[dev]"

Known Issues

32-bit Windows is not supported.
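
A script that depends on this package may want to detect the unsupported platform before attempting an install or import. The sketch below only detects a 32-bit Python on Windows; is_32bit_windows is a hypothetical helper name, and how the package itself fails on that platform is not stated here.

```python
import struct
import sys


def is_32bit_windows(
    platform: str = sys.platform,
    pointer_bits: int = struct.calcsize("P") * 8,
) -> bool:
    """Return True for a 32-bit Python interpreter running on Windows."""
    # sys.platform is "win32" on every Windows build, so the pointer size
    # is what distinguishes a 32-bit interpreter from a 64-bit one.
    return platform == "win32" and pointer_bits == 32


if __name__ == "__main__":
    if is_32bit_windows():
        print("32-bit Windows detected: jwm.robotstxt is not supported here")
```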

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
