fix offload gpu tests etc #10366

Merged: 2 commits merged into main from fix-max_memory on Jan 21, 2025
Conversation

@yiyixuxu (Collaborator) commented on Dec 24, 2024:

This PR:

  1. fixes the GPU offload tests
  2. refactors the Sana transformer so it works with device_map="auto" (see the sketch below)
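
For context, a minimal sketch of what item 2 enables. This example is not from the PR; the checkpoint id and dtype are assumptions for illustration.

```python
# Illustrative sketch: loading the Sana transformer with an automatic device map,
# which the refactor in this PR is meant to support.
import torch
from diffusers import SanaTransformer2DModel

transformer = SanaTransformer2DModel.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",  # assumed checkpoint id
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # let accelerate place submodules across available devices
)
```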

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu changed the title from "[WIP] fix offload gpu tests & a few device_map related refactor" to "[WIP] fix offload gpu tests etc" on Jan 13, 2025
```diff
@@ -1080,7 +1080,7 @@ def test_cpu_offload(self):
         torch.manual_seed(0)
         base_output = model(**inputs_dict)

-        model_size = compute_module_persistent_sizes(model)[""]
+        model_size = compute_module_sizes(model)[""]
```
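
For reference, a rough sketch of the test pattern around this hunk (not the exact diffusers test). The helper name `check_cpu_offload`, the `inputs_dict` fixture, and the 0.6 memory split are assumptions for illustration.

```python
# Rough sketch: an offload test computes the model size, caps GPU memory below it
# so accelerate is forced to place part of the model on CPU, then checks outputs match.
import tempfile

import torch
from accelerate.utils import compute_module_sizes


def check_cpu_offload(model, inputs_dict):
    torch.manual_seed(0)
    base_output = model(**inputs_dict)

    # Total parameter/buffer size in bytes for the whole model ("" is the root key).
    model_size = compute_module_sizes(model)[""]
    max_memory = {0: int(model_size * 0.6), "cpu": model_size * 2}

    with tempfile.TemporaryDirectory() as tmp_dir:
        model.save_pretrained(tmp_dir)
        offloaded = model.__class__.from_pretrained(
            tmp_dir, device_map="auto", max_memory=max_memory
        )

        torch.manual_seed(0)
        offloaded_output = offloaded(**inputs_dict)
        # Diffusers model outputs support integer indexing; [0] is the main tensor.
        assert torch.allclose(base_output[0], offloaded_output[0], atol=1e-5)
```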
@yiyixuxu changed the title from "[WIP] fix offload gpu tests etc" to "fix offload gpu tests etc" on Jan 13, 2025
@yiyixuxu requested review from DN6 and a-r-r-o-w on January 13, 2025
@a-r-r-o-w (Member) left a comment:

LGTM

```python
    self, hidden_states: torch.Tensor, temb: torch.Tensor, scale_shift_table: torch.Tensor
) -> torch.Tensor:
    hidden_states = self.norm(hidden_states)
    shift, scale = (scale_shift_table[None] + temb[:, None].to(scale_shift_table.device)).chunk(2, dim=1)
```
@a-r-r-o-w (Member) commented on the snippet above:
Not really a fan of this kind of device casting in forward, but okay to keep it since we don't have a better solution yet. These usually end up creating problems for anything that modifies device/dtype with hooks, and we then have to use workarounds.

Going forward, I think nn.Parameters can be put in their own dummy nn.Module so that device map, or other things we're introducing (like group offloading or fp8 layerwise upcasting), work out of the box, as those will handle the weight/type casting of inputs in overridden pre-hook methods. If this sounds good, I'll do future model integrations with this design.
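
For illustration, a minimal sketch of that idea with a hypothetical class name (not code from this PR): once the parameter lives in its own submodule, device-map and offloading hooks can move it and cast its inputs in pre-forward hooks, so no manual `.to(...)` is needed in the block's forward.

```python
# Hypothetical sketch of the suggested design: keep the scale/shift table inside
# its own tiny nn.Module so device_map / offloading hooks manage its placement.
import torch
import torch.nn as nn


class ScaleShiftTable(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # The parameter now lives in a real submodule instead of on the parent block.
        self.table = nn.Parameter(torch.randn(2, dim) / dim**0.5)

    def forward(self, temb: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # A pre-forward hook attached by accelerate can move/cast `temb` to match
        # `self.table` before this runs, removing the explicit device cast.
        shift, scale = (self.table[None] + temb[:, None]).chunk(2, dim=1)
        return shift, scale
```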

@yiyixuxu (Collaborator, Author) replied:

ohh, I actually did not think about this at all (I just copied it from the original code). Could you explain why we need this device casting here?

@a-r-r-o-w (Member) replied:

Ah okay, I see. I think I missed it when reviewing the PR that added Sana, otherwise I would probably have removed it then. I'm not really sure why it is needed here, and I think it might be okay to remove.

@DN6 merged commit a1f9a71 into main on Jan 21, 2025
15 checks passed
@yiyixuxu deleted the fix-max_memory branch on January 21, 2025