pyxpdf 0.2.3
pyxpdf is a fast, memory-efficient Python module for parsing PDF documents, based on the xpdf reader sources.
Features
Almost 20 times faster than pure-Python PDF parsers (see Speed Comparison)
Extract text while preserving the original document layout as closely as possible
Support almost all PDF encodings, CMaps and predefined CMaps.
Extract LZW, RLE, CCITTFax, DCT, JBIG2 and JPX compressed images and image masks along with their BBox.
Render PDF pages as images, with support for the ‘1’, ‘L’, ‘LA’, ‘RGB’, ‘RGBA’ and ‘CMYK’ color modes.
No explicit dependencies (except optional ones, see Installation)
Thread safe
More Information
Documentation
Installation
Quickstart
Contribute
Build
Issues
Pull requests
Speed Comparison
Changelog
License
pyxpdf is licensed under the GNU General Public License (GPL), version 3. See the LICENSE file.
Credits
xpdf reader by Derek Noonburg
lxml - project structure and build adapted from lxml
poppler project