Hello fellow TTS enthusiasts!
I'm working on a project using F5-TTS for an AI voice assistant, and I'm curious about how others are using it. So I came here to ask: how do you integrate F5-TTS into your Python scripts? Also, any tips on speeding up inference would be awesome!
In another thread (https://github.com//issues/224) I found these suggestions from SWivid for speeding things up:
"Use less nfe_step for speed-quality trade-off.
Try distillation techniques.
Train a smaller model from scratch if single-language needed application scenario."
When used in a script, where can you define the nfe_step parameter? I tried changing it in my .toml file, but it did not seem to make any speed difference. With the help of many YouTube tutorials I also managed to train my own model and then pointed to it as the ckpt file in the .toml file; I'd like to ask whether this is the correct way to do it.
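For scripted use, nfe_step is usually passed at inference time rather than only through the CLI's .toml config. A hedged sketch follows: the `f5_tts.api.F5TTS` wrapper and its `infer(..., nfe_step=...)` keyword exist in recent releases, but the exact names may differ in your installed version, so treat the commented call as an assumption to verify. The small runnable helper below just illustrates the roughly linear cost-per-step relationship:

```python
# Sketch: passing nfe_step when calling F5-TTS from Python.
# Assumption: the f5_tts.api.F5TTS wrapper as found in recent releases;
# check the signatures against your installed version.
#
#   from f5_tts.api import F5TTS
#   tts = F5TTS(ckpt_file="path/to/your_model.pt")   # your own checkpoint
#   wav, sr, _ = tts.infer(
#       ref_file="ref.wav", ref_text="reference transcript",
#       gen_text="Hello there!",
#       nfe_step=16,          # fewer ODE solver steps = faster, lower quality
#       file_wave="out.wav",
#   )
#
# The step count scales synthesis cost roughly linearly, so a quick
# back-of-envelope estimate of the speedup from lowering it:
def estimated_speedup(default_steps: int, new_steps: int) -> float:
    """Relative speedup under a linear cost-per-step model."""
    return default_steps / new_steps

print(estimated_speedup(32, 16))  # halving the steps is about 2x faster
```

Note that this only speeds up the synthesis itself; it does nothing for model-loading time, which is why a config change can look like it "made no difference" when startup dominates.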
Speed was the same even though the model I trained was 10-15 times smaller than the default one. So I figured the speed gains are being lost while the model is loaded into memory and F5-TTS starts up its other processes. Gradio is also much faster than my Python code: 2-3 seconds on Gradio vs. 15 in Python, with the new model.
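One way to confirm that the time is going into startup rather than synthesis is to time the two phases separately. The snippet below sketches the idea with placeholder load/infer functions (made up for illustration; the real calls would be your F5-TTS checkpoint load and inference):

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Placeholders standing in for the real F5-TTS calls.
def load_model():           # one-time cost: checkpoint + vocoder load
    time.sleep(0.2)
    return "model"

def infer(model, text):     # per-request cost
    time.sleep(0.05)
    return f"audio({text})"

model, load_s = timed(load_model)
_, infer_s = timed(infer, model, "hello")
print(f"load: {load_s:.2f}s  infer: {infer_s:.2f}s")
# If load_s dominates, a smaller model or fewer nfe_steps won't help
# script latency much -- keeping the model resident will.
```

This also explains the Gradio-vs-script gap: the Gradio app loads the model once when it starts and then only pays the inference cost per request, while a script that is launched per utterance pays the load cost every time.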
Basically, with my limited Python knowledge I wrapped this up into a function that takes an incoming string and turns it into speech. I soon realized this is not optimal, since the next time the AI outputs something, F5-TTS has to start all over again. How would one keep F5-TTS running and ready for the next input?
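The usual pattern is a long-lived worker: load the model once at startup, then feed it requests over a queue instead of re-launching anything. Here is a minimal self-contained sketch of that pattern; the model load and synthesis are placeholders (in a real setup they would be your F5-TTS load and infer calls, e.g. via the f5_tts API, which is an assumption to verify against your install):

```python
import queue
import threading

class TTSWorker:
    """Load the (expensive) TTS model once, then serve many requests."""

    def __init__(self):
        self.requests = queue.Queue()
        self.results = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _load_model(self):
        # Placeholder for the one-time heavy load (checkpoint, vocoder...).
        # Real version: something like f5_tts.api.F5TTS(ckpt_file=...).
        return lambda text: f"<audio for {text!r}>"

    def _run(self):
        model = self._load_model()           # paid once, not per request
        while True:
            text = self.requests.get()
            if text is None:                 # shutdown sentinel
                break
            self.results.put(model(text))    # reuse the already-loaded model

    def say(self, text):
        self.requests.put(text)
        return self.results.get()

    def close(self):
        self.requests.put(None)
        self._thread.join()

worker = TTSWorker()
print(worker.say("hello"))
print(worker.say("world"))   # no reload between calls
worker.close()
```

The same idea works without threads if you simply keep the loaded model object alive in your assistant's main loop and call infer on it per utterance; the worker form just keeps synthesis off the main thread.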
This became a long one! Thanks in advance!