perke 0.4.4
Perke
Perke is a Python keyphrase extraction package for the Persian language. It
provides an end-to-end keyphrase extraction pipeline in which each component
can be easily modified or extended to develop new models.
Installation
The easiest way to install is from PyPI:
pip install perke
Alternatively, you can install directly from GitHub:
pip install git+https://github.com/alirezatheh/perke.git
Perke also requires a trained POS tagger model. We use Hazm's POS tagger model.
You can download the latest version of it with the following command:
python -m perke download
Alternatively, you can use another model with the same tag names and structure
by putting it in the resources directory.
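Before running the examples, it can help to confirm that the package is importable in the active environment. The snippet below is a minimal sketch that uses only the standard library; the helper name `is_installed` is hypothetical and not part of Perke, and the check says nothing about whether the POS tagger model has been downloaded.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    # find_spec returns None when no top-level module with this name exists.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    if is_installed("perke"):
        print("perke is available")
    else:
        print("perke is missing; try: pip install perke")
```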
Simple Example
Perke provides a standardized API for extracting keyphrases from a text. Start
with the four steps below to use the TextRank keyphrase extractor.
from perke.unsupervised.graph_based import TextRank
# 1. Create a TextRank extractor.
extractor = TextRank()
# 2. Load the text.
extractor.load_text(input='text or path/to/input_file')
# 3. Build the graph representation of the text and weight the
# words. Keyphrase candidates are composed of the 33 percent
# highest weighted words.
extractor.weight_candidates(top_t_percent=0.33)
# 4. Get the 10 highest weighted candidates as keyphrases.
keyphrases = extractor.get_n_best(n=10)
For more in-depth examples, see the examples directory.
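To give a feel for what step 3 does conceptually, the following is a simplified, dependency-free sketch of the TextRank idea: build a word co-occurrence graph, score the words with a PageRank-style iteration, and keep the top fraction. It is not Perke's implementation. The function names, the `window`, `damping` and `iterations` parameters, and the sample tokens are all assumptions made for illustration, and a real pipeline would also filter words by POS tag.

```python
from collections import defaultdict


def build_cooccurrence_graph(words, window=2):
    """Link each word to the words that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Score words with an unweighted PageRank-style update (illustrative)."""
    scores = {word: 1.0 for word in graph}
    for _ in range(iterations):
        # Every new score is computed from the previous round's scores.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(graph[n]) for n in graph[word])
            for word in graph
        }
    return scores


def top_words(scores, top_t_percent=0.33):
    """Keep the highest scored fraction of words, at least one."""
    keep = max(1, int(len(scores) * top_t_percent))
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:keep]


if __name__ == "__main__":
    # Hypothetical, already tokenized input.
    tokens = "graph ranking uses graph structure and graph weights".split()
    scores = rank_words(build_cooccurrence_graph(tokens))
    print(top_words(scores, top_t_percent=0.33))
```

In this sketch, `top_t_percent` plays the same role as the argument passed to `weight_candidates` above: it controls what fraction of the ranked words is kept for building candidates.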
Documentation
Documentation and references are available at Read the Docs.
Implemented Models
Perke currently implements the following keyphrase extraction models:
Unsupervised models
Graph-based models
TextRank: article by Mihalcea and Tarau, 2004
SingleRank: article by Wan and Xiao, 2008
TopicRank: article by Bougouin, Boudin and Daille, 2013
PositionRank: article by Florescu and Caragea, 2017
MultipartiteRank: article by Boudin, 2018
Acknowledgements
Perke is inspired by pke.