TensorFlowASR 2.1.0

Creator: bradpython12


Description:

TensorFlowASR :zap:
Almost State-of-the-Art Automatic Speech Recognition in TensorFlow 2


TensorFlowASR implements several automatic speech recognition architectures, including DeepSpeech2, Jasper, RNN Transducer, ContextNet, and Conformer. These models can be converted to TFLite to reduce memory use and computation at deployment :smile:

Table of Contents

:yum: Supported Models

Baselines
Publications


Installation

Installing from source (recommended)
Installing via PyPI
Installing for development
Installing for Apple Silicon
Running in a container


Training & Testing Tutorial
Features Extraction
Augmentations
TFLite Conversion
Pretrained Models
Corpus Sources

English
Vietnamese


How to contribute
References & Credits
Contact


:yum: Supported Models
Baselines

Transducer models (end-to-end models trained with RNN-T loss; currently supports Conformer, ContextNet, and Streaming Transducer)
CTC models (end-to-end models trained with CTC loss; currently supports DeepSpeech2 and Jasper)

Publications

Conformer Transducer (Reference: https://arxiv.org/abs/2005.08100)
See examples/models/transducer/conformer
ContextNet (Reference: http://arxiv.org/abs/2005.03191)
See examples/models/transducer/contextnet
RNN Transducer (Reference: https://arxiv.org/abs/1811.06621)
See examples/models/transducer/rnnt
Deep Speech 2 (Reference: https://arxiv.org/abs/1512.02595)
See examples/models/ctc/deepspeech2
Jasper (Reference: https://arxiv.org/abs/1904.03288)
See examples/models/ctc/jasper

Installation
For training and testing, install from a git clone so that the required third-party packages (ctc_decoders, rnnt_loss, etc.) are installed as well.
Installing from source (recommended)
git clone https://github.com/TensorSpeech/TensorFlowASR.git
cd TensorFlowASR
# Tensorflow 2.x (with 2.x.x >= 2.5.1)
pip3 install ".[tf2.x]" # or ".[tf2.x-gpu]"

For anaconda3:
conda create -y -n tfasr tensorflow-gpu python=3.8 # use tensorflow instead if running on CPU; this ensures conda installs all TensorFlow dependencies
conda activate tfasr
pip install -U tensorflow-gpu # upgrade to latest version of tensorflow
git clone https://github.com/TensorSpeech/TensorFlowASR.git
cd TensorFlowASR
# Tensorflow 2.x (with 2.x.x >= 2.5.1)
pip3 install ".[tf2.x]" # or ".[tf2.x-gpu]"
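
The install comments above require TensorFlow 2.5.1 or newer. A minimal sketch of how that requirement could be checked from Python is shown below. The helper names `version_tuple`, `meets_minimum`, and `installed_tensorflow_ok` are hypothetical and are not part of TensorFlowASR. The sketch also assumes the distribution is published under the name `tensorflow`; a `tensorflow-gpu` install would need that name instead.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Return the first three numeric components of a version string as integers."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def meets_minimum(installed, minimum="2.5.1"):
    """Compare versions numerically, so that 2.14.0 counts as newer than 2.5.1."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_tensorflow_ok(distribution="tensorflow", minimum="2.5.1"):
    """Return True if the named distribution is installed and recent enough."""
    try:
        return meets_minimum(metadata.version(distribution), minimum)
    except metadata.PackageNotFoundError:
        return False
```

The numeric comparison matters because a plain string comparison would rank "2.14.0" below "2.5.1".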

Installing via PyPI
# Tensorflow 2.x (with 2.x >= 2.3)
pip3 install "TensorFlowASR[tf2.x]" # or pip3 install "TensorFlowASR[tf2.x-gpu]"

Installing for development
git clone https://github.com/TensorSpeech/TensorFlowASR.git
cd TensorFlowASR
pip3 install -e ".[dev]"
pip3 install -e ".[tf2.x]" # or ".[tf2.x-gpu]", or ".[tf2.x-apple]" on Apple Silicon (M1) machines

Installing for Apple Silicon
Because tensorflow-text is not built for Apple Silicon, install it from the prebuilt wheel provided by sun1638650145/Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon.
git clone https://github.com/TensorSpeech/TensorFlowASR.git
cd TensorFlowASR
pip3 install -e "." # or: pip3 install -e ".[dev]" for development, or: pip3 install "TensorFlowASR[dev]" from PyPI
pip3 install tensorflow~=2.14.0 # change minor version if you want

Run the following after installing TensorFlowASR and TensorFlow as shown above:
TF_VERSION="$(python3 -c 'import tensorflow; print(tensorflow.__version__)')" && \
TF_VERSION_MAJOR="$(echo $TF_VERSION | cut -d'.' -f1,2)" && \
PY_VERSION="$(python3 -c 'import platform; major, minor, patch = platform.python_version_tuple(); print(f"{major}{minor}");')" && \
URL="https://github.com/sun1638650145/Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon" && \
pip3 install "${URL}/releases/download/v${TF_VERSION_MAJOR}/tensorflow_text-${TF_VERSION_MAJOR}.0-cp${PY_VERSION}-cp${PY_VERSION}-macosx_11_0_arm64.whl"
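
The shell commands above assemble the wheel URL from the installed TensorFlow version and the running Python version. The same string construction can be sketched in Python to make each piece visible. The function name `tensorflow_text_wheel_url` is hypothetical; the URL pattern is copied from the command above, and whether a wheel exists for a given version pair is not checked here.

```python
import platform

BASE_URL = (
    "https://github.com/sun1638650145/"
    "Libraries-and-Extensions-for-TensorFlow-for-Apple-Silicon"
)


def tensorflow_text_wheel_url(tf_version, python_version=None):
    """Build the tensorflow-text wheel URL for a TensorFlow version such as '2.14.0'."""
    # Keep only major.minor, mirroring `cut -d'.' -f1,2` in the shell command.
    tf_major, tf_minor = tf_version.split(".")[:2]
    tf_release = f"{tf_major}.{tf_minor}"
    # Default to the running interpreter, as the shell command does.
    if python_version is None:
        py_major, py_minor, _ = platform.python_version_tuple()
    else:
        py_major, py_minor = python_version
    py_tag = f"cp{py_major}{py_minor}"
    # The shell command always requests patch version ".0" of tensorflow-text.
    return (
        f"{BASE_URL}/releases/download/v{tf_release}/"
        f"tensorflow_text-{tf_release}.0-{py_tag}-{py_tag}-macosx_11_0_arm64.whl"
    )
```

For example, `tensorflow_text_wheel_url("2.14.0", (3, 10))` yields a URL ending in `tensorflow_text-2.14.0-cp310-cp310-macosx_11_0_arm64.whl`.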

Running in a container
docker-compose up -d

Training & Testing Tutorial

For training, please read tutorial_training
For testing, please read tutorial_testing

FYI: Keras built-in training uses an infinite (repeating) dataset, which avoids a potential partial last batch.
See examples for some predefined ASR models and results
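
The note on infinite datasets can be illustrated without TensorFlow. The sketch below is a pure-Python stand-in, not TensorFlowASR code: it repeats a small dataset endlessly and cuts it into fixed-size batches, so every batch is full even when the dataset size is not a multiple of the batch size. The function name `full_batches` is hypothetical.

```python
import itertools


def full_batches(samples, batch_size, steps):
    """Yield `steps` batches of exactly `batch_size` items from a repeated dataset."""
    stream = itertools.cycle(samples)  # endless repetition of the dataset
    for _ in range(steps):
        yield list(itertools.islice(stream, batch_size))


# 10 samples with batch size 4: a single pass would end with a batch of 2.
samples = list(range(10))
batches = list(full_batches(samples, batch_size=4, steps=3))
# The third batch wraps around to the start of the dataset instead of being short.
```

With `tf.data`, the comparable pattern is to call `repeat()` on the dataset before batching and to pass `steps_per_epoch` to `model.fit`, since an infinite dataset has no natural end of epoch.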
Features Extraction
See features_extraction
Augmentations
See augmentations
TFLite Conversion
After conversion, the TFLite model behaves like a single function that maps an audio signal directly to text and tokens.
See tflite_convertion
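
As a rough sketch of what calling a converted model might look like, the code below uses the standard `tf.lite.Interpreter` API. The tensor layout is an assumption and should be checked against tflite_convertion: the first input is taken to be the raw waveform, any remaining inputs are zero-filled, and the transcript is taken to be a sequence of Unicode code points. The function names `run_tflite` and `codepoints_to_text` are hypothetical.

```python
def codepoints_to_text(codepoints):
    """Join Unicode code points into a string, skipping zero padding."""
    return "".join(chr(int(c)) for c in codepoints if int(c) > 0)


def run_tflite(model_path, signal):
    """Feed a 1-D waveform (NumPy array) to a TFLite model and return its raw outputs."""
    # Imported lazily so the helper above can be used without TensorFlow installed.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    first_input = interpreter.get_input_details()[0]
    # Assumption: the first input is the variable-length audio signal.
    interpreter.resize_tensor_input(first_input["index"], signal.shape)
    interpreter.allocate_tensors()
    interpreter.set_tensor(first_input["index"], signal.astype(np.float32))
    # Assumption: remaining inputs (for example recurrent states) start at zero.
    for detail in interpreter.get_input_details()[1:]:
        zeros = np.zeros(tuple(detail["shape"]), dtype=detail["dtype"])
        interpreter.set_tensor(detail["index"], zeros)
    interpreter.invoke()
    return [interpreter.get_tensor(d["index"]) for d in interpreter.get_output_details()]
```

If the first output does hold code points, `codepoints_to_text(run_tflite(path, signal)[0])` would give the transcript; inspect `get_output_details()` on the actual model before relying on that order.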
Pretrained Models
Go to drive
Corpus Sources
English

| Name | Source | Hours |
| --- | --- | --- |
| LibriSpeech | LibriSpeech | 970h |
| Common Voice | https://commonvoice.mozilla.org | 1932h |

Vietnamese

| Name | Source | Hours |
| --- | --- | --- |
| Vivos | https://ailab.hcmus.edu.vn/vivos | 15h |
| InfoRe Technology 1 | InfoRe1 (passwd: BroughtToYouByInfoRe) | 25h |
| InfoRe Technology 2 (used in VLSP2019) | InfoRe2 (passwd: BroughtToYouByInfoRe) | 415h |

How to contribute

Fork the project
Install for development
Create a branch
Make a pull request to this repo

References & Credits

NVIDIA OpenSeq2Seq Toolkit
https://github.com/noahchalifour/warp-transducer
Sequence Transduction with Recurrent Neural Network
End-to-End Speech Processing Toolkit in PyTorch
https://github.com/iankur/ContextNet

Contact
Huy Le Nguyen
Email: nlhuy.cs.16@gmail.com

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
