substream 0.1.1

Creator: bradpython12

Last updated: September 25, 2024

Languages

Python

Description:

Substream
Transcribes an audio file to .srt subtitle format using word timings from
Google's Speech-to-Text API.
Requirements:

A Google account that is signed up for Google Cloud.

Installing:
pip install substream

Cloud setup:

Create a new service account for a new project dedicated to your recognition
job. It must have the following permissions:

Cloud Speech Service Agent
Storage Admin (needed to create and delete the temporary bucket), OR
Storage Object Viewer if supplying a gs:// URI to the script.

You can set the location of the .json credentials file you downloaded in the
current environment like:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/cloud_credentials.json

(OR) you can set it just before the substream command like:
GOOGLE_APPLICATION_CREDENTIALS=/path/to/cloud_credentials.json substream ...
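
If you drive substream from a Python script, it can help to check the variable before starting a job. The following is a minimal sketch, not part of substream itself; the function name credentials_path is hypothetical, and it only checks that the variable is set and points to an existing file.

```python
import os
from pathlib import Path


def credentials_path(environ=os.environ):
    """Return the file named by GOOGLE_APPLICATION_CREDENTIALS.

    Raises RuntimeError if the variable is unset or is not an existing file.
    (Hypothetical helper for illustration; substream may check differently.)
    """
    value = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value).expanduser()
    if not path.is_file():
        raise RuntimeError(f"credentials file not found: {path}")
    return path
```

Calling credentials_path() with no arguments reads the real environment; passing a dict makes the check easy to try out without touching your shell.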

On run, a temporary bucket is created and the file is uploaded to it. On
completion or error, a context manager ensures the bucket is deleted.
Please be careful with these credentials, as cloud resources can be expensive.
Store them securely if you store them at all, and check manually that all
project buckets are deleted, even if the app reports that they have been
successfully deleted.
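
The create, upload, and delete pattern described above can be sketched roughly as follows. This is an illustration and not substream's actual code: the names temporary_bucket and upload_and_process, the bucket name prefix, and the object name are all assumptions. The client argument is expected to behave like google.cloud.storage.Client, whose create_bucket, Bucket.blob, Blob.upload_from_filename and Bucket.delete(force=True) methods are the only calls used.

```python
import uuid
from contextlib import contextmanager


@contextmanager
def temporary_bucket(client, prefix="substream-tmp-"):
    """Create a uniquely named bucket and always delete it on exit.

    `client` should behave like google.cloud.storage.Client.
    The prefix is a hypothetical naming choice for this sketch.
    """
    name = prefix + uuid.uuid4().hex
    bucket = client.create_bucket(name)
    try:
        yield bucket
    finally:
        # Runs on success and on error; force=True asks the library to
        # delete the objects in the bucket before deleting the bucket.
        bucket.delete(force=True)


def upload_and_process(client, local_path, process):
    """Upload a file to a temporary bucket and call `process` with its URI."""
    with temporary_bucket(client) as bucket:
        blob = bucket.blob("input-audio")  # assumed object name
        blob.upload_from_filename(local_path)
        return process(f"gs://{bucket.name}/{blob.name}")
```

Because the deletion sits in a finally clause, it runs even when process raises. It cannot run if the interpreter itself is killed, which is one reason a manual check of the project's buckets is still worthwhile.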

Full Usage:
usage: substream [-h] -i INPUT -o SRT_FILENAME [--language CODE] [-v]

Transcribes an audio file or .jsonl dump to .srt using the Google Cloud
Speech-to-Text API

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        mono audio file (flac, opus, 16 bit pcm) (or) gs://
                        uri to audio file (or) intermediate .jsonl dump
                        (default: None)
  -o SRT_FILENAME, --output SRT_FILENAME
                        .srt filename (default: None)
  --language CODE       https://cloud.google.com/speech-to-text/docs/languages
                        (default: en-US)
  -v, --verbose         extra logging (default: False)

Sample usage with a local file:
substream -v -i test.flac -o test.srt --language en-US

Sample usage with a URI:
substream -v -i gs://my-bucket/test.flac -o test.srt
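
The -i option accepts three kinds of input: a local audio file, a gs:// URI, or an intermediate .jsonl dump. How substream tells them apart internally is not documented here, so the following is only a guess based on the text of the argument; InputKind and classify_input are hypothetical names.

```python
from enum import Enum


class InputKind(Enum):
    GCS_URI = "gs:// URI"
    JSONL_DUMP = ".jsonl dump"
    LOCAL_AUDIO = "local audio file"


def classify_input(value):
    """Guess the kind of an -i/--input value from its text alone.

    Assumption: a gs:// prefix marks a URI and a .jsonl suffix marks a dump;
    everything else is treated as a local audio file.
    """
    if value.startswith("gs://"):
        return InputKind.GCS_URI
    if value.lower().endswith(".jsonl"):
        return InputKind.JSONL_DUMP
    return InputKind.LOCAL_AUDIO
```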

Uninstalling:
pip uninstall substream

FAQ

Why the long-running API rather than the streaming API?
The long-running API is more accurate.

What is the .jsonl file?
Each stripped line in the file is a string containing a JSON representation
of a word with its start and end timings. Later versions of this program
may accept the .jsonl file to format the sentences in a better way without
having to re-run the audio transcription.
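
As an illustration of the format, a file like this could be read back as shown below. The key names "word", "start" and "end", and the use of seconds as the unit, are assumptions made for this sketch; check an actual dump for the real field names before relying on it.

```python
import json
from typing import Iterable, List, NamedTuple


class Word(NamedTuple):
    text: str
    start: float  # seconds (assumed unit)
    end: float    # seconds (assumed unit)


def read_words(lines: Iterable[str]) -> List[Word]:
    """Parse one JSON object per non-empty line into Word records."""
    words = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # "word", "start" and "end" are assumed key names.
        words.append(
            Word(record["word"], float(record["start"]), float(record["end"]))
        )
    return words
```

An open file object can be passed directly, for example read_words(open("dump.jsonl", encoding="utf-8")).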

Known Issues:

'Walls of text' caused by people speaking without interruption. Some
subtitles may have to be manually split using an .srt editor.

Speaker identification is currently broken in the long-running
API for long files, so splitting on speaker changes is currently disabled
(this exacerbates the above point).

Progress reporting is currently unimplemented by the long-running API.
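
For the 'walls of text' issue, one possible workaround is to regroup the word timings yourself. The sketch below is not how substream splits subtitles; it simply starts a new subtitle after a fixed number of words or after a long pause. The thresholds max_words=12 and max_gap=0.7 seconds are arbitrary assumptions, and words are plain (text, start, end) tuples with times in seconds.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def split_words(words, max_words=12, max_gap=0.7):
    """Group (text, start, end) tuples into subtitle cues.

    A new cue starts when the current one already holds max_words words,
    or when the pause before the next word is longer than max_gap seconds.
    """
    cues, current = [], []
    for word in words:
        if current:
            pause = word[1] - current[-1][2]
            if len(current) >= max_words or pause > max_gap:
                cues.append(current)
                current = []
        current.append(word)
    if current:
        cues.append(current)
    return cues


def to_srt(cues):
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        text = " ".join(word[0] for word in cue)
        start = srt_timestamp(cue[0][1])
        end = srt_timestamp(cue[-1][2])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)
```

Lowering max_words gives shorter subtitles at the cost of more frequent changes on screen; raising max_gap keeps slow speakers in one subtitle.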

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
