This is a Python application built with PyQt and Stable-TS that provides real-time subtitling for any audio input or output device. The application relies heavily on the Stable-TS library for audio transcription. It is a work in progress and is not yet ready for production use.
Currently, the application requires a CUDA-enabled GPU to run. Apple Silicon and AMD GPUs may work but have not been tested. The application is not yet optimized for CPU-only use. Almost any modern NVIDIA GPU should work, provided the CUDA toolkit is installed.
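To check the GPU requirement before launching, a short script along these lines may help. It is only a sketch: it assumes PyTorch is the framework that exposes CUDA to the application (Stable-TS is built on PyTorch), and the function name `describe_gpu_support` is hypothetical, not part of this project.

```python
import importlib.util


def describe_gpu_support() -> str:
    """Return a short, human-readable summary of CUDA availability."""
    # Assumption: PyTorch is the CUDA-facing dependency. Bail out cleanly
    # if it is not installed in the current environment.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed; cannot check for CUDA."

    import torch

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device was detected."

    # Report the name of the first visible CUDA device.
    return f"CUDA device detected: {torch.cuda.get_device_name(0)}"


print(describe_gpu_support())
```

If the script reports that no CUDA device was detected, the NVIDIA driver or the CUDA-enabled build of PyTorch is the first thing to check.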
- Real-Time Audio Transcription: Uses Stable-TS for near-instant audio-to-text conversion, keeping the lag between speech and subtitles small.
- Audio Device Selection: Choose any connected audio input or output device for transcription, offering flexibility for various use cases.
- Linux Compatibility: Optimized for Linux environments, ensuring smooth operation and integration with your system's audio infrastructure.
- Offline Mode: Use the application without an active internet connection, keeping sensitive audio data on your machine. Before using offline mode, run the application online once so it can download the required models.
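Once the models have been downloaded, one way to make sure nothing reaches for the network is to set the Hugging Face offline environment variables before starting the application. This is a sketch under an assumption the README does not confirm: that the selected backend loads its models through the Hugging Face libraries, which honor `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE`. The helper names `offline_env` and `launch_offline` are hypothetical.

```python
import os
import subprocess
import sys


def offline_env(base_env=None):
    """Return a copy of the environment with Hugging Face offline flags set."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: models are served from the local Hugging Face cache.
    env["HF_HUB_OFFLINE"] = "1"
    env["TRANSFORMERS_OFFLINE"] = "1"
    return env


def launch_offline(entry_point="main.py"):
    """Start the application with the offline flags applied."""
    # Uses the current interpreter so the active virtual environment is kept.
    return subprocess.run([sys.executable, entry_point], env=offline_env())
```

Calling `launch_offline()` from the project directory would start `main.py` with both flags set. Backends that do not use the Hugging Face cache will ignore these variables.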
- Download the installation script and make it executable.
wget https://raw.githubusercontent.com/KUKHUA/ui-sub-live-rewrite/main/scripts/installer.sh
chmod +x installer.sh
- Run the installation script.
./installer.sh
Not yet supported.
- Clone the repository to your local machine.
git clone https://github.com/KUKHUA/ui-sub-live-rewrite.git
- Install the required dependencies. It is recommended to use a virtual environment to avoid conflicts with other Python packages.
# Create a virtual environment
python -m venv RealTime-TS
# Activate it by uncommenting the line for your operating system.
# For Linux or macOS:
# source RealTime-TS/bin/activate
# For Windows:
# .\RealTime-TS\Scripts\activate
# Navigate to the project directory
cd ui-sub-live-rewrite
# Change the suffix of the requirements file name to match your operating system (if such a file exists).
# Install the required dependencies
pip install -r requirements-linux.txt
- Run the application.
python main.py
- If you get any errors, check the terminal output and install any missing dependencies.
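When the terminal output points to an import error, a quick check like the following can show which packages are missing from the active environment. It is a minimal sketch: the module names in `ASSUMED_MODULES` are guesses (the README does not say which PyQt version is used; `stable_whisper` is the import name of the Stable-TS package), so adjust the list to match the requirements file you installed.

```python
import importlib.util

# Assumed import names; adjust to match your requirements file.
ASSUMED_MODULES = ("PyQt5", "stable_whisper", "torch")


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup failures the same as a missing module.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


missing = find_missing_modules(ASSUMED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All checked modules are importable.")
```

Note that the pip package name can differ from the import name, so look up the package that provides each missing module before installing it.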
- Select a backend. "Stable-TS Transformers" is recommended for most use cases.
- Select a working folder. This is where the application stores the subtitle output as a .TXT file.
- Select a transcription model. The default model is the best choice for most use cases.
- Check the VAD (Voice Activity Detection) box if you want to enable VAD.
- Select a denoiser model. A denoiser removes background noise, such as music or traffic.
- Select a batch size. The batch size is the number of audio frames processed at once. Because the audio chunks are short, a smaller batch size is recommended.
- Select an audio device. This is the device whose audio the application will transcribe.
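Because the subtitles are written to a .TXT file in the working folder, other tools can pick them up from there. The sketch below reads the most recent non-empty line from the newest text file in that folder. The file name and layout of the output are not documented here, so the code assumes one subtitle per line and simply picks the most recently modified `.txt` file; both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def newest_subtitle_file(working_folder: str) -> Optional[Path]:
    """Return the most recently modified .txt file in the folder, if any."""
    candidates = [
        path
        for path in Path(working_folder).iterdir()
        if path.is_file() and path.suffix.lower() == ".txt"
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)


def last_subtitle_line(working_folder: str) -> str:
    """Return the last non-empty line of the newest subtitle file."""
    path = newest_subtitle_file(working_folder)
    if path is None:
        return ""
    # Assumption: the output is plain text with one subtitle per line.
    text = path.read_text(encoding="utf-8", errors="replace")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return lines[-1] if lines else ""
```

A streaming overlay or logger could call `last_subtitle_line()` on a timer with the folder chosen in the settings above.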