py621dl 0.1a1.dev0

Creator: railscoderz

Description:

py621dl - an iterable E621 downloader
This package is meant for deep learning applications and automation,
not for downloading specific images or post IDs or for searching by tag.
For those tasks, please check out py621,
which is not related to this package in any way.
The package works with the official E621 db export of post information.
The available db exports and general information on the API are documented on the E621 site.
!! This is a pre-release version, and is not meant for production use !!
Proper documentation, tests, and automated updates to the package will be added later.
Installation
Install the package with pip install py621dl. Python >= 3.11 is required.
Usage
The E621Downloader class must be initialized with a Reader object, and the Reader
in turn must be given the csv file. The Reader supports only the official db export csv files
named "posts-YYYY-MM-DD.csv.gz", either compressed or uncompressed.
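As an illustration of the file name convention above, here is a minimal sketch that checks a path before it is handed to the Reader. The helper name parse_export_name is hypothetical and is not part of py621dl; how the package itself validates the name is not documented here.

```python
import re
from datetime import date
from pathlib import Path

# "posts-YYYY-MM-DD.csv" with an optional ".gz" suffix.
_EXPORT_NAME = re.compile(r"posts-(\d{4})-(\d{2})-(\d{2})\.csv(\.gz)?")


def parse_export_name(path):
    """Return (export_date, is_compressed) for a posts export path.

    Hypothetical helper, not part of py621dl.
    Raises ValueError if the file name does not follow the convention.
    """
    match = _EXPORT_NAME.fullmatch(Path(path).name)
    if match is None:
        raise ValueError(f"not a posts export file name: {path!r}")
    year, month, day = (int(part) for part in match.groups()[:3])
    # date() itself raises ValueError for impossible dates such as month 13.
    return date(year, month, day), match.group(4) is not None
```

Calling parse_export_name("posts-2022-10-30.csv.gz") returns the export date together with a flag saying whether the file is compressed.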
The E621Downloader class can be initialized with the following parameters:

csv_reader: the Reader object
timeout: the timeout for the requests, in seconds
retries: the number of retries for the requests
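To make the timeout and retries parameters concrete, the following is a rough sketch of how a per-request timeout and a retry count usually interact. It is not the package's implementation: the function fetch_with_retries, the injected fetch callable, and the reading of retries as "extra attempts after the first one" are all assumptions made for illustration.

```python
import urllib.request


def http_get(url, timeout):
    """Fetch a URL with the standard library, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(fetch, url, timeout=10.0, retries=3):
    """Call fetch(url, timeout) once, then up to `retries` more times on failure.

    Hypothetical helper, not part of py621dl. The last error is re-raised
    when every attempt fails.
    """
    if retries < 0:
        raise ValueError("retries must be >= 0")
    last_error = None
    for _ in range(retries + 1):
        try:
            return fetch(url, timeout)
        except OSError as error:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            last_error = error
    raise last_error
```

Under these assumptions, fetch_with_retries(http_get, url, timeout=10, retries=3) makes at most four requests, each limited to ten seconds.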

It can be used as an iterable that yields lists of np.ndarray image objects. The list size
is the batch_size specified for the Reader. The images are in OpenCV BGR format.
The downloader automatically filters out deleted or flagged posts, and fills
the batch with further images so that it always yields a full batch.
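The batch-filling behaviour described above can be sketched with a small generator. This is only an illustration of the idea, not the package's code: full_batches and its load callable are hypothetical names, and dropping a trailing partial batch is an assumption, since the text does not say what happens when the source runs out mid-batch.

```python
def full_batches(items, load, batch_size):
    """Yield lists of exactly `batch_size` loaded items.

    Hypothetical helper, not part of py621dl. `load(item)` returns the
    loaded object, or None when the item is unusable (for example a
    deleted post or a failed download); such items are skipped and the
    batch keeps filling from the following items.
    """
    batch = []
    for item in items:
        loaded = load(item)
        if loaded is None:
            continue
        batch.append(loaded)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Assumption: a trailing partial batch is dropped so that every
    # yielded batch is full.
```

Because skipped items never reach the batch, the consumer always sees lists of the same length, which is convenient when the batches are stacked into tensors for training.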
The Reader class can be initialized with the following parameters:

csv_file: the path to the csv file
batch_size: the size of the batch to be returned by the E621Downloader
excluded_tags: a list of E621 tags to be excluded from the results
minimum_score: the minimum score of the posts to be included in the results
chunk_size: the size of the chunk to be read from the csv file at once
checkpoint_file: the path to the checkpoint file, used to resume from any point. If the path doesn't exist, a new file will
be created.
repeat: whether to restart automatically from the beginning of the csv file when the end is reached.
Otherwise StopIteration is raised.
E621Downloader handles this exception and raises its own StopIteration when the end is reached.
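The effect of excluded_tags and minimum_score can be pictured as a row filter over the export. The sketch below is an assumption-laden illustration rather than the Reader's actual logic: the column names (score, tag_string, is_deleted, is_flagged), the "t"/"f" encoding of booleans, and the helper names are assumed and should be checked against the header of the export file in use.

```python
import csv
import gzip


def open_export(path):
    """Open a posts export as text, whether or not it is gzip-compressed."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", newline="")
    return open(path, "r", encoding="utf-8", newline="")


def keep_post(row, excluded_tags=(), minimum_score=0):
    """Return True if a row passes the status, score and tag filters.

    Hypothetical helper; column names and "t"/"f" booleans are assumptions.
    """
    if row["is_deleted"] == "t" or row["is_flagged"] == "t":
        return False
    if int(row["score"]) < minimum_score:
        return False
    # tag_string is assumed to hold space-separated tags.
    return set(row["tag_string"].split()).isdisjoint(excluded_tags)


def filtered_posts(lines, excluded_tags=(), minimum_score=0):
    """Yield the rows of a csv text stream that pass keep_post."""
    for row in csv.DictReader(lines):
        if keep_post(row, excluded_tags, minimum_score):
            yield row
```

A typical call under these assumptions would wrap open_export(path) in a with statement and pass the file object to filtered_posts.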

Example use
from py621dl import Reader, E621Downloader

reader = Reader("posts-2022-10-30.csv.gz")
downloader = E621Downloader(reader, timeout=10, retries=3)

for batch in downloader:
    # do something with the batch
    pass

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
