pdfdata 0.1.3.2

Creator: railscoder56

Description:

{pdfdata}
Python package for extracting text and data from PDFs.
Installation
pip install pdfdata
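
To confirm the install succeeded, the version can be read from the package metadata. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of pdfdata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed pdfdata version, or None if pip install has not been run
    print(installed_version("pdfdata"))
```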

Usage
from pprint import pprint

from pdfdata import (
    parse_pdf,
    pdf_doc_extract_span_df,
    pdf_doc_extract_span_list,
    pdf_text_to_jsonnl,
)

# Parse the PDF as a dictionary
pdf_parsed = parse_pdf('pdfs/0641-20.pdf')
res = pdf_doc_extract_span_list(pdf_parsed)

# Limit the printed nesting depth to keep the output readable
pprint(res, depth=3)

# Parse the PDF as a list of spans
pdf_parsed = parse_pdf('pdfs/0641-20.pdf')
res = pdf_doc_extract_span_df(pdf_parsed)

# Show the first item only
pprint(res[0])

# Transform the PDF text to JSONNL
pdf_text_to_jsonnl('pdfs/0641-20.pdf', '0641-20.jsonnl')
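
The written file can then be loaded back for further processing. The sketch below assumes that "jsonnl" means newline-delimited JSON, with one JSON value per line; the README does not state the format, so check the actual output before relying on this. The helper name read_json_lines is hypothetical and not part of pdfdata.

```python
import json


def read_json_lines(path):
    """Yield one parsed record per non-empty line of a newline-delimited JSON file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Calling list(read_json_lines('0641-20.jsonnl')) would collect every record in memory; iterating over the generator directly keeps memory use flat for large files.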

DevNotes
Build
python -m build

TestPyPI upload
python -m twine upload --repository testpypi dist/* --skip-existing

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
