bpc 0.1.3


Description:

BURMESE PHONEMIZER AND CLEANER (BPC)

Burmese language data preparation for speech-related tasks.

Installation
$ pip install bpc

or, to install directly from GitHub:
$ pip install git+https://github.com/1chimaruGin/Burmese_Phomizer_and_Cleaner.git

Usage
For text cleaning
from bpc import Cleaner

cc = Cleaner()
cc.clean_text("မင်္ဂလာပါ? မင်္ဂလာပါ။ ၀န်းရံ ဝ၁၂၃၄ 5B")

# output: မင်္ဂလာပါ မင်္ဂလာပါ။ ဝန်းရံ ၀၁၂၃၄ ၅ဘီ
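
The sample output suggests that the cleaner, among other things, swaps the Myanmar letter wa (U+101D) and the Myanmar digit zero (U+1040) when one is typed in place of the other, and rewrites ASCII digits as Myanmar digits. The following is a minimal standalone sketch of that kind of normalization. It is not bpc's implementation: the function name normalize_wa_zero and the adjacency heuristic are assumptions made purely for illustration.

```python
import re

WA = "\u101d"                 # Myanmar letter wa
ZERO = "\u1040"               # Myanmar digit zero
MM_DIGITS = "\u1040-\u1049"   # Myanmar digits 0-9, as a regex character range

# ASCII digits 0-9 mapped to Myanmar digits U+1040..U+1049.
ASCII_TO_MM = str.maketrans(
    "0123456789", "".join(chr(0x1040 + i) for i in range(10))
)


def normalize_wa_zero(text):
    """Hypothetical helper: a heuristic sketch, not bpc's actual logic."""
    # 1. Rewrite ASCII digits as Myanmar digits.
    text = text.translate(ASCII_TO_MM)
    # 2. A letter wa directly next to a digit is assumed to be a mistyped zero.
    text = re.sub(
        f"{WA}(?=[{MM_DIGITS}])|(?<=[{MM_DIGITS}]){WA}", ZERO, text
    )
    # 3. A digit zero that does not follow a digit and is directly followed by
    #    a Myanmar letter or sign (U+1000..U+103F) is assumed to be a mistyped wa.
    text = re.sub(
        f"(?<![{MM_DIGITS}]){ZERO}(?=[\u1000-\u103f])", WA, text
    )
    return text


if __name__ == "__main__":
    # Zero typed for wa, then wa typed for zero, then an ASCII digit.
    sample = "\u1040\u1014\u103a\u1038\u101b\u1036 \u101d\u1041\u1042\u1043\u1044 5"
    print(normalize_wa_zero(sample))
```

A rule based only on neighbouring characters will misfire on some real text, so for actual data preparation Cleaner.clean_text remains the tool to use.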

For phonemization
from bpc import BurmesePhonemizer

bp = BurmesePhonemizer()
bp.text_to_phone("မင်္ဂလာပါ")

# output: ['m', 'ŋ', 'ɡ', 'l', 't', 's', 'p', 'ˈe']

For data preparation
from bpc.dataset import PrepareDataset

dataset = PrepareDataset()
dataset.prepare_data(path='path/to/dataset', method='kfold', save=True)
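
What method='kfold' does inside bpc is not described here. As a general illustration of k-fold splitting, the sketch below partitions a list of items (for example, utterance IDs) into k folds and yields one train/validation pair per fold. The function kfold_splits and its parameters are hypothetical and not part of the bpc API.

```python
import random


def kfold_splits(items, k=5, seed=0):
    """Yield (train, valid) lists, one pair per fold. Illustrative only."""
    items = list(items)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(items)
    # Fold i takes every k-th item starting at offset i.
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, valid


if __name__ == "__main__":
    utterance_ids = [f"utt_{n:03d}" for n in range(10)]
    for fold, (train, valid) in enumerate(kfold_splits(utterance_ids, k=5)):
        print(fold, len(train), len(valid))
```

Each item lands in exactly one validation fold, so every utterance is used for validation once and for training k - 1 times.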

References

https://github.com/espnet/espnet
https://github.com/bootphon/phonemizer

Citations
@inproceedings{watanabe2018espnet,
author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson {Enrique Yalta Soplin} and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
title={{ESPnet}: End-to-End Speech Processing Toolkit},
year={2018},
booktitle={Proceedings of Interspeech},
pages={2207--2211},
doi={10.21437/Interspeech.2018-1456},
url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}

@article{Bernard2021,
doi = {10.21105/joss.03958},
url = {https://doi.org/10.21105/joss.03958},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {68},
pages = {3958},
author = {Mathieu Bernard and Hadrien Titeux},
title = {Phonemizer: Text to Phones Transcription for Multiple Languages in Python},
journal = {Journal of Open Source Software}
}

License:

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
