This directory contains a service that receives audio data over a websocket and sends the transcription result, produced by the CoquiSTT speech-to-text engine, back to the client. The service can also receive RTP packets and extract their payload (transcription of the payload is a work in progress). The websocket server code in this project is a modified version of an existing GitHub project.
Server configuration is specified in the `application.conf` file.
- Clone the repository with `git clone`
- Download and install FFmpeg
- Download the acoustic model and language model files for CoquiSTT and place them in the cloned repository
- Create a venv using `python -m venv venv`
- Activate the venv using `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux)
- Run `pip install -r requirements.txt`
- Run `python -m coqui_server.app`
A sample client script is provided, which can be run by executing the following:

```
coqui_server\client.py 2830-3980-0043.wav
```

`2830-3980-0043.wav` can be replaced with the path to the audio file to be transcribed.
The websocket client-server request-response process looks like the following (a minimal client sketch is shown after this list):
- Client opens websocket W to server
- Client sends binary audio data via W
- Server responds with the transcribed text via W once the transcription process is complete; the response is in JSON format
- Server closes W
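
For illustration, the exchange above can be driven from Python roughly as sketched below. This is a minimal sketch rather than the bundled client: the server URI and port and the `text` field of the JSON response are assumptions, so check `application.conf` and `coqui_server/client.py` for the actual values.

```python
# Minimal sketch of the websocket exchange, using the third-party "websockets"
# package. The URI/port and the JSON field name "text" are assumptions.
import asyncio
import json
import sys

import websockets


async def transcribe(path: str, uri: str = "ws://localhost:8080") -> str:
    with open(path, "rb") as f:
        audio = f.read()                        # raw audio bytes to transcribe
    async with websockets.connect(uri) as ws:   # client opens websocket W
        await ws.send(audio)                    # client sends binary audio via W
        reply = await ws.recv()                 # server replies once transcription is done
    return json.loads(reply).get("text", "")    # assumed field name in the JSON response


if __name__ == "__main__":
    print(asyncio.run(transcribe(sys.argv[1])))
```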
The time t taken by the transcription process depends on several factors, such as the duration of the audio and how busy the service is. Under normal circumstances, t is roughly equal to the duration of the provided audio.
Because this service uses websockets, it is currently not possible to interact with it using HTTP clients that do not support websockets, such as `curl`.
The server can also accept RTP packets. Upon receiving an RTP packet, the server decodes it to obtain the payload. The payload is then passed to `webrtcvad`, and the voiced audio frames are sent to CoquiSTT for transcription. The transcription functionality is still a work in progress.
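
To make this pipeline concrete, below is a rough sketch of the payload-extraction and VAD step. The frame size, sample rate, and the assumption that the RTP payload is already 16-bit linear PCM are illustrative only (the real service may first decode a codec such as G.711); the header layout follows RFC 3550 and the `webrtcvad` calls use its public API.

```python
# Rough sketch: strip the RTP header, then keep only frames that webrtcvad
# classifies as speech. Frame length, sample rate and the PCM assumption are
# illustrative, not necessarily what the real server uses.
import struct

import webrtcvad

SAMPLE_RATE = 8000   # assumed; webrtcvad accepts 8/16/32/48 kHz
FRAME_MS = 20        # webrtcvad accepts only 10, 20 or 30 ms frames
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2   # 16-bit mono PCM

vad = webrtcvad.Vad(2)   # aggressiveness 0 (least) to 3 (most)


def rtp_payload(packet: bytes) -> bytes:
    """Strip the RTP header (RFC 3550) and return the payload bytes."""
    first = packet[0]
    csrc_count = first & 0x0F
    has_extension = bool(first & 0x10)
    offset = 12 + 4 * csrc_count                  # fixed header + CSRC identifiers
    if has_extension:
        # extension header: 16-bit profile, 16-bit length in 32-bit words
        ext_words = struct.unpack_from("!H", packet, offset + 2)[0]
        offset += 4 + 4 * ext_words
    return packet[offset:]


def voiced_frames(pcm: bytes):
    """Yield only the frames that webrtcvad classifies as speech."""
    for i in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES):
        frame = pcm[i:i + FRAME_BYTES]
        if vad.is_speech(frame, SAMPLE_RATE):
            yield frame   # in the real service these frames would go to CoquiSTT
```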