Commit 8115d8c (parent 6969fc9): 2 changed files with 9 additions and 2 deletions.

Comment:
Using Flux: the Load Diffusion Model node will crash ComfyUI if you're using an fp8 version of the transformer and weight_dtype is left on "default" instead of being picked manually.
The regular Load Checkpoint node also crashes ComfyUI when loading an all-in-one fp8 checkpoint that has everything packaged into it, probably for the same reason as the Load Diffusion Model node?
This is on an RTX 2080.
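For reference, a minimal sketch of that kind of manual dtype handling in plain PyTorch (the file name and the choice of fp16 are illustrative assumptions, not ComfyUI's actual loader code):

```python
import torch
from safetensors.torch import load_file

# Hypothetical workaround sketch: load an fp8 checkpoint and upcast the fp8
# weights to fp16 up front, rather than relying on the loader's "default"
# dtype handling. The file name is illustrative.
FP8_DTYPES = (torch.float8_e4m3fn, torch.float8_e5m2)  # requires PyTorch >= 2.1

state_dict = load_file("flux1-dev-fp8.safetensors", device="cpu")
for name, tensor in state_dict.items():
    if tensor.dtype in FP8_DTYPES:
        # Older GPUs such as the RTX 2080 have no fp8 compute support, so
        # keeping the weights in fp16 avoids depending on fp8 downstream.
        state_dict[name] = tensor.to(torch.float16)
```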
Comment:
The crash is most likely due to running out of memory (OOM). Try increasing your page file size or setting it to system managed.
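A quick way to check whether RAM plus page file leaves enough headroom before loading is something like the following (psutil is assumed to be available; the 40 GB threshold is only an example based on the peak reported later in this thread):

```python
import psutil

# Rough headroom check before loading a large checkpoint: the weights are
# held in system RAM while loading, so RAM plus page file must cover the
# peak. The 40 GB threshold is illustrative, not a measured requirement.
REQUIRED_GB = 40

vm = psutil.virtual_memory()  # physical RAM
sm = psutil.swap_memory()     # page file / swap

headroom_gb = (vm.available + sm.free) / 1024**3
print(f"RAM available:  {vm.available / 1024**3:.1f} GB")
print(f"Page file free: {sm.free / 1024**3:.1f} GB")

if headroom_gb < REQUIRED_GB:
    print(f"Likely to OOM: only {headroom_gb:.1f} GB of headroom, "
          f"~{REQUIRED_GB} GB may be needed while loading.")
```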
Comment:
Yeah, that was it: memory usage was peaking somewhere in the 40 GB range. I have 32 GB of system RAM and had reduced my page file to 8 GB while testing something else last night, and apparently forgot I'd done that. Bumping it back up to 16 GB got it working again; I'll probably take it back up to 32 GB anyway.
Comment:
Whenever I switch from one Flux model to another, Python crashes as soon as the new model starts loading, before VRAM or RAM usage changes at all; sometimes I need to log out of Windows and back in to recover. (12 GB VRAM + 32 GB RAM + 20 GB page file.)
One Flux model -> another Flux model: crash.
One Flux model -> a non-Flux model (success) -> another Flux model: crash.
Comment:
This commit is slowing down my inference times by 50%.
#4271 (comment)
Comment:
Also, the images are slightly different (tried with fp8 models).
What does this commit do? Are the models now being upcast only to fp16 instead of fp32?
If so, it doesn't seem to make much difference in terms of RAM usage; I still need more than 32 GB of RAM while the model is loading.
#4239
Comment:
Upgraded the RAM to 48 GB, and the problem was gone.
Comment:
The default is bf16, but it will use fp16 instead of fp32 if your card has poor support for bf16.
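A minimal sketch of that fallback logic using PyTorch's built-in capability check (an illustration of the idea, not the actual ComfyUI code path):

```python
import torch

def pick_compute_dtype() -> torch.dtype:
    """Prefer bf16; fall back to fp16 on GPUs with poor or missing bf16 support."""
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    # Cards without native bf16 (e.g. GTX 10xx, RTX 20xx) take the fp16 path
    # rather than upcasting all the way to fp32.
    return torch.float16

print(pick_compute_dtype())
```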
Comment:
Yeah, I probably have poor bf16 support (GTX 1070).
But using fp16 instead of fp32 doesn't seem to be making any improvement here: generation times are slower, and I still need more than 32 GB of RAM.
In terms of image quality, it's about the same; some images seem slightly better, others slightly worse, depending on the image itself.
But theoretically, using fp16 instead of fp32 should give worse quality, right?
Comment:
Ideally, I'd like to be able to choose between upcasting to fp16 or fp32.
And fp16 should be usable with less than 32 GB of RAM when loading the models (without needing the page file).
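For illustration, a rough sketch of what a user-selectable upcast could look like; the helper name, the "fp16"/"fp32" flag values, and the file path are hypothetical, not an existing ComfyUI option:

```python
import torch
from safetensors.torch import load_file

def load_upcast(path, target="fp16"):
    """Hypothetical helper: load a checkpoint and upcast floating-point
    weights to a user-chosen dtype instead of a hard-coded one."""
    dtype = {"fp16": torch.float16, "fp32": torch.float32}[target]
    sd = load_file(path, device="cpu")
    return {k: (v.to(dtype) if v.is_floating_point() else v) for k, v in sd.items()}

# fp16 keeps the loaded weights at half the size of fp32 in system RAM,
# which is the point of the request above. The path is illustrative.
# sd = load_upcast("flux1-dev-fp8.safetensors", target="fp16")
```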