pyxpdf 0.2.3

Creator: railscoder56


Description:

pyxpdf is a fast, memory-efficient Python module for parsing PDF documents, based on the xpdf reader sources.



Features

Almost 20 times faster than pure-Python PDF parsers (see Speed Comparison)
Extract text while maintaining the original document layout (as closely as possible)
Support almost all PDF encodings, CMaps and predefined CMaps
Extract LZW, RLE, CCITTFax, DCT, JBIG2 and JPX compressed images and image masks along with their BBox
Render PDF pages as images, with support for the ‘1’, ‘L’, ‘LA’, ‘RGB’, ‘RGBA’ and ‘CMYK’ color modes
No explicit dependencies (except optional ones, see Installation)
Thread-safe
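
As a rough illustration of how the text-extraction and thread-safety features might be combined, the sketch below extracts per-page text from several PDFs in parallel threads. The helper names (page_texts, extract_many) are hypothetical and not part of pyxpdf. The sketch only assumes that a document object can be iterated page by page and that each page has a text() method, which is how the project's Quickstart appears to use pyxpdf.Document; check the Documentation before relying on it.

```python
from concurrent.futures import ThreadPoolExecutor


def page_texts(document):
    """Return the text of every page in an iterable, page-like document."""
    # Assumption: each page object exposes a text() method.
    return [page.text() for page in document]


def extract_many(paths, open_document, max_workers=4):
    """Extract per-page text from several PDFs using a thread pool.

    open_document is any callable that maps a path to a document object.
    With pyxpdf installed this would presumably be pyxpdf.Document
    (an assumption, not verified here).
    """
    def work(path):
        # Each task opens its own document, so no document is shared
        # between threads.
        return page_texts(open_document(path))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(work, paths)))


# Hypothetical usage, assuming pyxpdf is installed and the files exist:
#
#     from pyxpdf import Document
#     texts = extract_many(["first.pdf", "second.pdf"], Document)
#     print(texts["first.pdf"][0])   # text of the first page
```

Passing the document opener in as an argument keeps the helper independent of any one parser, which also makes it easy to time pyxpdf against another library on the same files.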



More Information

Documentation

Installation
Quickstart


Contribute

Build
Issues
Pull requests


Speed Comparison
Changelog



License
pyxpdf is licensed under the GNU General Public License (GPL), version 3. See the LICENSE


Credits

xpdf reader by Derek Noonburg
lxml - project structure and build adapted from lxml
poppler project

