Can inference be done at FP8? for both 1K and 2K models #128

Open
FurkanGozukara opened this issue Jan 5, 2025 · 10 comments
Labels
Answered Answered the question

Comments

@FurkanGozukara

People are asking me to reduce VRAM usage further.

Currently the 1K model uses a minimum of 8.7 GB with VAE offloading.

If we could run inference in FP8, that would reduce VRAM usage significantly.

I am using the official SANA pipeline shared here.
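
As a rough illustration of why lower precision helps, the sketch below estimates the memory needed for the transformer weights alone at 2, 1, and 0.5 bytes per parameter. The parameter count is a hypothetical value chosen for illustration and is not taken from this thread. The text encoder, the VAE, and activations also use VRAM, so the total saving will be smaller than the ratio shown here.

```python
# Rough weight-memory estimate at different storage precisions.
# ASSUMPTION: the parameter count used below is illustrative only.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


def estimate_all(num_params: float) -> dict:
    """Map each precision name to its weight footprint in GiB."""
    return {
        name: weight_memory_gib(num_params, nbytes)
        for name, nbytes in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    hypothetical_params = 1.6e9  # assumed value, not confirmed in this thread
    for name, gib in estimate_all(hypothetical_params).items():
        print(f"{name:>10}: {gib:.2f} GiB")
```

Going from 16-bit to FP8 halves the weight footprint, and int4 halves it again. The measured 8.7 GB figure above includes more than weights, so it will not scale by the same factor.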

@lawrence-cj
Collaborator

There will be an int4 version of Sana released soon.

@lawrence-cj lawrence-cj added the Answered Answered the question label Jan 5, 2025
@FurkanGozukara
Author

> There will be an int4 version of Sana released soon.

Awesome, I am following ❤️

@xieenze
Collaborator

xieenze commented Jan 8, 2025

For FP8, you can use the torchao toolbox.
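
To make concrete what weight-only FP8 storage does to the numbers, here is a pure-Python sketch that simulates rounding to the E4M3 format with a single per-tensor scale. It is not torchao's API and does not reproduce its implementation; the function names are hypothetical, and a real library would use hardware dtypes and may choose a different scaling granularity.

```python
import math

E4M3_MAX = 448.0      # largest finite value in the E4M3 (fn) format
E4M3_MIN_EXP = -6     # exponent of the smallest normal number
MANTISSA_BITS = 3     # explicit mantissa bits in E4M3


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    mag = min(abs(x), E4M3_MAX)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) == e - 1.
    # Clamping the exponent makes small inputs use the fixed subnormal spacing.
    exp = max(math.frexp(mag)[1] - 1, E4M3_MIN_EXP)
    step = 2.0 ** (exp - MANTISSA_BITS)  # spacing between neighbouring values
    q = min(round(mag / step) * step, E4M3_MAX)
    return math.copysign(q, x)


def quantize_dequantize(weights):
    """Simulate weight-only FP8: scale into range, round to E4M3, scale back."""
    amax = max(abs(w) for w in weights)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    return [round_to_e4m3(w / scale) * scale for w in weights]


if __name__ == "__main__":
    original = [0.013, -0.27, 0.5, 1.1, -0.0004]  # made-up example weights
    for w, q in zip(original, quantize_dequantize(original)):
        print(f"{w:+.6f} -> {q:+.6f}")
```

With three mantissa bits, values in the normal range are reproduced to within about 6% relative error, which is the accuracy cost paid for storing one byte per weight.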

@FurkanGozukara
Author

> For FP8, you can use the torchao toolbox.

How can I use it with the current pipeline here?

@lawrence-cj
Collaborator

lawrence-cj commented Jan 10, 2025

@FurkanGozukara
Author

@lawrence-cj thanks.

Is diffusers ready right now, with all the pull requests you made?

@lawrence-cj
Collaborator

Which pull requests are we talking about?

@FurkanGozukara
Author

@lawrence-cj I see none of these are closed, so I assume they are not merged yet?

huggingface/diffusers#10523

huggingface/diffusers#10510

@lawrence-cj
Collaborator

They were only opened within the last 24 hours. No hurry.

@Shiven-saini

Waiting for the int4 version, it will be a game changer! Is there any way I can contribute to the quantization process?

4 participants