Augmentext 0.1.7
Augmentext
Augmentext is a text augmentation package for Natural Language Processing, with a focus on applications in the biomedical domain.
Augmentext is a work in progress! Some features are functional, but it is not yet in a usable state.
Features
Auto-generated, randomised misspellings
Dictionary-based thesaurus word replacement
Auto-generated abbreviations
More to come...
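As an illustration of what randomised misspelling can look like, here is a minimal sketch in plain Python. It is not Augmentext's API: the function names misspell and augment, the rate parameter, and the choice of adjacent-character swaps as the only error type are all assumptions made for this example.

```python
import random


def misspell(word, rng):
    """Return `word` with two adjacent characters swapped at a random position."""
    if len(word) < 2:
        return word  # nothing to swap in a one-character word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def augment(text, rate=0.2, seed=None):
    """Misspell each whitespace-separated word with probability `rate`.

    Hypothetical helper: passing a seed makes the output reproducible.
    """
    rng = random.Random(seed)
    words = text.split()
    return " ".join(misspell(w, rng) if rng.random() < rate else w for w in words)


if __name__ == "__main__":
    print(augment("the patient presented with acute abdominal pain", rate=0.5, seed=42))
```

Using a dedicated random.Random instance instead of the module-level functions keeps the augmentation reproducible without touching global random state, which matters when augmented datasets need to be regenerated.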
Biomedical Domain Specific Features
Although it is a general-purpose library, Augmentext has a special focus on biomedical text, with features such as:
Replacement of units such as mm/g^2 with commonly mistaken forms, e.g. g/mm^2
Conversion of units from metric to imperial/customary and vice versa
Integration of SNOMED, ICD, MeSH, RxNorm and other text corpora into the augmentation pipeline
Synonym replacement using pre-trained GloVe, fastText, and word2vec models
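The two unit-related features can be sketched as follows. This is only an illustration under stated assumptions, not Augmentext's implementation: the names FACTORS, to_imperial and swap_compound_unit are hypothetical, only kg and cm are handled, and the conversion factors are rounded.

```python
import re

# Hypothetical lookup table: metric unit -> (imperial unit, multiplication factor).
FACTORS = {
    "kg": ("lb", 2.20462),
    "cm": ("in", 0.393701),
}

# A number (optionally with decimals) followed by one of the known metric units.
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s*(kg|cm)\b")

# A compound unit of the form numerator/denominator with an optional exponent.
COMPOUND = re.compile(r"([a-zA-Z]+)/([a-zA-Z]+)(\^\d+)?")


def to_imperial(text):
    """Rewrite metric quantities in `text` as imperial ones, to one decimal place."""
    def convert(match):
        value = float(match.group(1))
        unit, factor = FACTORS[match.group(2)]
        return f"{value * factor:.1f} {unit}"
    return QUANTITY.sub(convert, text)


def swap_compound_unit(unit):
    """Swap the symbols on either side of '/', leaving any exponent in place."""
    match = COMPOUND.fullmatch(unit)
    if match is None:
        return unit  # not a compound unit this sketch understands
    numerator, denominator, exponent = match.group(1), match.group(2), match.group(3) or ""
    return f"{denominator}/{numerator}{exponent}"


if __name__ == "__main__":
    print(to_imperial("weight 70 kg, height 180 cm"))
    print(swap_compound_unit("g/mm^2"))
```

Swapping the symbols while leaving the exponent where it is reproduces the kind of transposition error listed above, where g/mm^2 and mm/g^2 are confused with one another.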
More Information
See the project's GitHub repository: https://github.com/mdbloice/Augmentext
Help will be available here once the software has been made public on GitHub: https://augmentext.readthedocs.io
For personal and professional use. You cannot resell or redistribute these repositories in their original state.