Add support for TTS engines that fully support Apple Silicon #233
Comments
There's an existing PR that incorporates MPS (#181). The last time I tested it there was no notable improvement in speed. If it requires a nightly build of PyTorch that complicates things, though I did intend to incorporate MPS support once no additional work was required from the user. If you want to try out that PR and report back on how it works, that would be much appreciated! As for better quality than VITS (like voice p307), I encourage you to try Edge.
@aedocw Thanks. Edge provides great quality. I'm just wondering how XTTS can be further leveraged. I checked out the mps branch and ran the code again; I'm wondering whether I should still pass the --no-deepspeed flag.
Ah, that's from a contribution someone put in that skips the GPU, even when one is found, if there is not enough memory. I would search for the line with that check. You probably do have to use --no-deepspeed.
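For illustration, here is a minimal sketch (not the actual epub2tts code) of what device selection with an MPS fallback typically looks like in PyTorch; the memory-threshold check mentioned above is project-specific and omitted here:

```python
# Minimal sketch of PyTorch device selection with an MPS fallback.
# This is illustrative only and not the epub2tts implementation.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():
        return torch.device("cuda")
    # MPS is Apple's Metal backend, available on Apple Silicon with recent PyTorch builds.
    if torch.backends.mps.is_available() and torch.backends.mps.is_built():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(4, 4).to(device)  # placeholder model, just to show the .to(device) call
print(f"Running on: {device}")
```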
@aedocw Well for me, on a maxed-out M3 MacBook, it did make some difference: the rate was 2-3 it/s and is now around 6-7 it/s. With that said, it still took me around 14 hours to convert one book, Structures or Why Things Don't Fall Down, to an audiobook using XTTS, which is abysmal in terms of overall performance. I'll report back with my time using Edge, but I'm still very much interested in alternative approaches, if any, to doing this more efficiently. I rented a small instance with one H100 GPU for an hour and even that wasn't fast enough.
Using Edge I think a book like that would probably take 3 hours or so, and that is not dependent on your hardware. Using XTTS on a computer with an NVIDIA GPU, I think it is again about 3 hours. Without full DeepSpeed compatibility, XTTS is unusable in my opinion.
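For reference, this is roughly what generating audio via Edge looks like, assuming the edge-tts Python package; the voice name is illustrative and not necessarily what epub2tts uses internally:

```python
# Minimal sketch using the edge-tts package (pip install edge-tts).
# The voice name and output path are placeholders for illustration.
import asyncio
import edge_tts

async def main():
    text = "Structures, or Why Things Don't Fall Down."
    communicate = edge_tts.Communicate(text, voice="en-US-GuyNeural")
    await communicate.save("sample.mp3")  # network call to the Edge TTS service

asyncio.run(main())
```

Because synthesis happens on Microsoft's servers, generation speed is largely independent of local hardware, which is why the estimate above does not change between machines.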
I don't know if this request fully makes sense, but I did some research around the likes of Piper and Tortoise, which have shown the capability to leverage Metal and nightly builds of PyTorch to run more smoothly than, say, XTTS (which requires me to pass --no-deepspeed to even run). So I'm just wondering whether it is possible to integrate such systems into epub2tts for people who want better quality than p307 but also faster generation speeds on a MacBook.
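As an illustration of the kind of setup being asked about, here is a minimal sketch using Coqui TTS's public XTTS v2 API with an MPS device. This is not epub2tts's actual code; the reference-voice path is a placeholder, and whether every XTTS operation actually runs on MPS depends on the PyTorch build:

```python
# Sketch of loading Coqui XTTS v2 and attempting to use the MPS backend.
# Illustrative only; paths are placeholders and not part of epub2tts.
import torch
from TTS.api import TTS

device = "mps" if torch.backends.mps.is_available() else "cpu"
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)
tts.tts_to_file(
    text="A short test sentence.",
    speaker_wav="reference_voice.wav",  # placeholder path to a short voice sample
    language="en",
    file_path="xtts_test.wav",
)
```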