Welcome to the funaudiollm-app repository! This project hosts two applications that use advanced audio understanding and speech generation models to bring your audio experiences to life:
Voice Chat: This application is designed to provide an interactive and natural chatting experience, making it easier to adopt sophisticated AI-driven dialogues in various settings.
Voice Translation: Break down language barriers with our real-time voice translation tool. This application seamlessly translates spoken language on the fly, allowing for effective and fluid communication between speakers of different languages.
For details, visit the FunAudioLLM Homepage, the CosyVoice Paper, and the FunAudioLLM Technical Report.
For CosyVoice, visit the CosyVoice repo and the CosyVoice space.
For SenseVoice, visit the SenseVoice repo and the SenseVoice space.
Clone and install
- Clone the repo and submodules
git clone --recursive URL
# If cloning the submodules fails due to network errors, rerun the following commands until they succeed
cd funaudiollm-app
git submodule update --init --recursive
- Prepare the environments in the submodules according to the CosyVoice and SenseVoice repos. If you have already prepared these resources elsewhere, you can instead modify the resource path configuration in app.py (lines 15-18).
- Execute the command below.
pip install -r requirements.txt
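The advice in the first step to rerun the submodule update until it succeeds can be automated. The following is a minimal sketch, assuming `git` is on the PATH and the script is run from inside the funaudiollm-app directory. The helper name `run_until_success` and the attempt and delay values are hypothetical and not part of this repo.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=5, delay_seconds=5.0):
    """Run cmd until it exits with status 0 or the attempts run out.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. This helper is illustrative, not part of the repo.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        # Wait before retrying, but not after the final failed attempt.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return None


# Example call (assumes the current directory is funaudiollm-app):
# run_until_success(["git", "submodule", "update", "--init", "--recursive"])
```

A bounded number of attempts is used here so that a persistent failure, such as a wrong remote URL, surfaces instead of looping forever.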
Prepare a DashScope API token.
Voice chat
cd voice_chat
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt
Then open https://YOUR-IP-ADDRESS:60001/ in a browser.
Voice translation
cd voice_translation
sudo CUDA_VISIBLE_DEVICES="0" DS_API_TOKEN="YOUR-DS-API-TOKEN" python app.py >> ./log.txt
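Both launch commands pass `CUDA_VISIBLE_DEVICES` and `DS_API_TOKEN` as environment variables. A small pre-launch check can catch a missing or empty value before the models load. This is a minimal sketch, assuming the apps read those two variables from the environment as the commands above suggest. The helper name `missing_settings` is hypothetical and not part of this repo.

```python
import os


def missing_settings(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the current process environment. This helper is
    illustrative, not part of the repo.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example check before starting app.py:
# missing = missing_settings(["DS_API_TOKEN", "CUDA_VISIBLE_DEVICES"])
# if missing:
#     raise SystemExit("Missing environment variables: " + ", ".join(missing))
```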