grobid2json 0.0.1

Creator: bradpython12

Description:

Grobid2Json
This package takes the code that parses GROBID XML into JSON from the s2orc-doc2json project and publishes it as a standalone PyPI package.

✨ Features

Convert the XML files produced by GROBID into JSON format.

📦 Installation
pip install grobid2json

🤯 Usage
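from bs4 import BeautifulSoup
from grobid2json import convert_xml_to_json

file_path = "test.xml"
# Read the GROBID output as bytes so the parser can detect the encoding
with open(file_path, "rb") as f:
    xml_data = f.read()
# The "xml" parser requires lxml to be installed
soup = BeautifulSoup(xml_data, "xml")
# Use the file name without its extension as the paper ID
paper_id = file_path.split("/")[-1].split(".")[0]
paper = convert_xml_to_json(soup, paper_id, "")
json_data = paper.as_json()
print(json_data)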
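
The snippet above prints the result. As a minimal sketch of one way to save it instead, the helpers below derive the paper ID with pathlib and write the data to a .json file. The names paper_id_from_path and save_json are hypothetical and not part of grobid2json, and the sketch assumes that paper.as_json() returns a JSON-serializable dict. It uses only the standard library, with a placeholder dict standing in for the real output.

```python
import json
import tempfile
from pathlib import Path


def paper_id_from_path(file_path: str) -> str:
    """Return the file name up to its first dot, e.g. 'dir/test.tei.xml' -> 'test'."""
    return Path(file_path).name.split(".")[0]


def save_json(data: dict, out_dir: str, paper_id: str) -> Path:
    """Write `data` to <out_dir>/<paper_id>.json and return the output path."""
    out_path = Path(out_dir) / f"{paper_id}.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)
    return out_path


if __name__ == "__main__":
    # Placeholder dict standing in for the value returned by paper.as_json()
    # (assumption: that value is a plain, JSON-serializable dict).
    example = {"paper_id": paper_id_from_path("papers/test.tei.xml"), "title": "Example"}
    with tempfile.TemporaryDirectory() as tmp:
        print(save_json(example, tmp, example["paper_id"]))
```

In real use, json_data from the usage snippet would be passed to save_json in place of the placeholder dict.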

🔗 Links
Credits

s2orc-doc2json - https://github.com/allenai/s2orc-doc2json


📝 License
This project is licensed under the Apache License 2.0.
