Sherpa Onnx | GitLocker.com Product

Description:

sherpa onnx

Supported functions #

Speech recognition ✔️
Speech synthesis ✔️
Speaker verification ✔️
Speaker identification ✔️
Spoken language identification ✔️
Audio tagging ✔️
Voice activity detection ✔️
Keyword spotting ✔️
Add punctuation ✔️

Supported platforms #

Architecture: supported operating systems

x64: Android, Windows, macOS, Linux
x86: Android, Windows
arm64: Android, iOS, Windows, macOS, Linux
arm32: Android, Linux
riscv64: Linux

Supported programming languages #

1. C++ ✔️
2. C ✔️
3. Python ✔️
4. JavaScript ✔️
5. Java ✔️
6. C# ✔️
7. Kotlin ✔️
8. Swift ✔️
9. Go ✔️
10. Dart ✔️
11. Rust ✔️
12. Pascal ✔️

For Rust support, please see sherpa-rs.
sherpa-onnx also supports WebAssembly.

Introduction #
This repository supports running the following functions locally:

Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
Text-to-speech (i.e., TTS)
Speaker identification
Speaker verification
Spoken language identification
Audio tagging
VAD (e.g., silero-vad)
Keyword spotting

on the following platforms and operating systems:

x86, x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
Linux, macOS, Windows, openKylin
Android, WearOS
iOS
NodeJS
WebAssembly
Raspberry Pi
RV1126
LicheePi4A
VisionFive 2
旭日X3派 (Horizon Sunrise X3 Pi)
爱芯派 (AXera-Pi)
etc.

with the following APIs:

C++, C, Python, Go, C#
Java, Kotlin, JavaScript
Swift, Rust
Dart, Object Pascal
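
Speaker verification and speaker identification in the list above differ mainly in the question being asked: verification checks a voice against one claimed identity, while identification searches all enrolled speakers for the best match. The following is a minimal sketch of that distinction, assuming each utterance has already been reduced to a fixed-length embedding vector. The function names, the toy 3-dimensional embeddings, and the 0.6 threshold are illustrative assumptions, not the sherpa-onnx API.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(enrolled: Dict[str, List[float]], name: str,
           embedding: List[float], threshold: float = 0.6) -> bool:
    """Verification: does the embedding match the ONE claimed speaker?"""
    if name not in enrolled:
        return False
    return cosine_similarity(enrolled[name], embedding) >= threshold


def identify(enrolled: Dict[str, List[float]], embedding: List[float],
             threshold: float = 0.6) -> Optional[str]:
    """Identification: which enrolled speaker, if any, matches best?"""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(reference, embedding)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings (hypothetical); a real speaker model produces far more dimensions.
enrolled = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
query = [0.9, 0.1, 0.0]

print(verify(enrolled, "alice", query))  # claimed identity matches
print(verify(enrolled, "bob", query))    # claimed identity does not match
print(identify(enrolled, query))         # best match among all enrolled speakers
```

The threshold trades false accepts against false rejects, so in practice it would be tuned on held-out recordings rather than fixed at an arbitrary value.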

Links for Huggingface Spaces #
You can visit the following Huggingface spaces to try sherpa-onnx without
installing anything. All you need is a browser.

Speech recognition
Speech recognition with Whisper
Speech synthesis
Generate subtitles
Audio tagging
Spoken language identification with Whisper

We also have spaces built using WebAssembly. They are listed below:

Each demo is available as both a Huggingface space and a ModelScope space.

Voice activity detection with silero-vad
Real-time speech recognition (Chinese + English) with Zipformer
Real-time speech recognition (Chinese + English) with Paraformer
Real-time speech recognition (Chinese + English + Cantonese) with Paraformer-large
Real-time speech recognition (English)
VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with SenseVoice
VAD + speech recognition (English) with Whisper tiny.en
VAD + speech recognition (English) with Zipformer trained with GigaSpeech
VAD + speech recognition (Chinese) with Zipformer trained with WenetSpeech
VAD + speech recognition (Japanese) with Zipformer trained with ReazonSpeech
VAD + speech recognition (Thai) with Zipformer trained with GigaSpeech2
VAD + speech recognition (Chinese, multiple dialects) with a TeleSpeech-ASR CTC model
VAD + speech recognition (English + Chinese, and multiple Chinese dialects) with Paraformer-large
VAD + speech recognition (English + Chinese, and multiple Chinese dialects) with Paraformer-small
Speech synthesis (English)
Speech synthesis (German)
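
Several of the demos above pair VAD with a non-streaming recognizer: the VAD cuts the audio into speech segments, and each segment is then decoded on its own. The sketch below shows only the shape of that pipeline. It assumes a simple energy threshold in place of a neural VAD such as silero-vad; `split_speech`, the frame size, and the threshold are hypothetical and chosen for illustration.

```python
from typing import List, Tuple


def split_speech(samples: List[float], frame_size: int = 4,
                 threshold: float = 0.01) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for runs of high-energy frames.

    Energy thresholding stands in for a real VAD model here.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for begin in range(0, len(samples), frame_size):
        frame = samples[begin:begin + frame_size]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= threshold:
            if start is None:
                start = begin  # a speech run begins at this frame
        elif start is not None:
            segments.append((start, begin))  # the speech run ended
            start = None
    if start is not None:
        segments.append((start, len(samples)))  # audio ended during speech
    return segments


# Synthetic audio: silence, a loud burst, silence, a quieter burst.
audio = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8 + [0.3, -0.3] * 2
segments = split_speech(audio)
print(segments)

# Each slice would then be handed to a non-streaming recognizer.
chunks = [audio[start:end] for start, end in segments]
```

Decoding per segment keeps the recognizer from spending time on silence and gives natural boundaries for subtitles or per-utterance results.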

Links for pre-built Android APKs #

Each APK has a download address, plus a mirror for users in China.

Streaming speech recognition
Text-to-speech
Voice activity detection (VAD)
VAD + non-streaming speech recognition
Two-pass speech recognition
Audio tagging
Audio tagging (WearOS)
Speaker identification
Spoken language identification
Keyword spotting

Links for pre-built Flutter APPs #
Each app has a download address, plus a mirror for users in China.

Real-time speech recognition:
Streaming speech recognition

Text-to-speech:
Android (arm64-v8a, armeabi-v7a, x86_64)
Linux (x64)
macOS (x64)
macOS (arm64)
Windows (x64)

Note: You need to build from source for iOS.

Links for pre-built Lazarus APPs #
Generating subtitles:
Generate subtitles (download address, plus a mirror for users in China)

Links for pre-trained models #

Speech recognition (speech to text, ASR)
Text-to-speech (TTS)
VAD
Keyword spotting
Audio tagging
Speaker identification (Speaker ID)
Spoken language identification (Language ID): see the multilingual Whisper ASR models under Speech recognition
Punctuation

Useful links #

Documentation: https://k2-fsa.github.io/sherpa/onnx/
Bilibili demo videos: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi

How to reach us #
Please see
https://k2-fsa.github.io/sherpa/social-groups.html
for the Next-gen Kaldi (新一代 Kaldi) WeChat and QQ discussion groups.

Overview

For personal and professional use. You cannot resell or redistribute these repositories in their original state.

You may use the code in these repositories in an unlimited number of projects.
Attribution is not required.

What you can do with it

Use them freely in your personal and professional work.

What you can't do with it

Don't be greedy. Selling or distributing these repositories in their original state is prohibited.
