Adds SDXL support and CI testing, benchmarks. #271

monorimet · 2023-12-18T17:28:36Z

No description provided.

aviator19941

LGTM

dan-garvey · 2023-12-18T19:06:16Z

test failure seems real

aviator19941 · 2023-12-18T20:29:47Z

@monorimet I think you need to update the run_unet function with guidance_scale in the definition, add it as an ireert.asdevicearray arg, and update the run_unet test to use guidance_scale instead of the run_torch_unet function.

monorimet · 2023-12-19T06:41:51Z

@monorimet I think you need to update the run_unet function with guidance_scale in the definition, add it as an ireert.asdevicearray arg, and update the run_unet test to use guidance_scale instead of the run_torch_unet function.

Oops. thanks for taking a look. I'll fix it shortly.

monorimet · 2023-12-19T07:25:45Z

There's a correctness difference. I probably need to update the torch implementation. I'll come back to it tomorrow.

IanNod · 2024-01-19T22:09:00Z

python/turbine_models/tests/sd_test.py

-        os.remove("stable_diffusion_v1_4_clip.safetensors")
-        os.remove("stable_diffusion_v1_4_clip.vmfb")
+        # os.remove(f"{arguments['safe_model_name']}_clip.safetensors")
+        # os.remove(f"{arguments['safe_model_name']}_clip.vmfb")


Why are these commented out?

Some race condition between this and the actual test execution. Will leave these comments unresolved until test file mgmt is fixed.

IanNod · 2024-01-19T22:14:15Z

python/turbine_models/tests/sd_test.py

-        os.remove("stable_diffusion_v1_4_unet.safetensors")
-        os.remove("stable_diffusion_v1_4_unet.vmfb")
+        # os.remove(f"{arguments['safe_model_name']}_unet.safetensors")
+        # os.remove(f"{arguments['safe_model_name']}_unet.vmfb")


same as above

IanNod · 2024-01-19T22:14:25Z

python/turbine_models/tests/sd_test.py

-        os.remove("stable_diffusion_v1_4_vae.safetensors")
-        os.remove("stable_diffusion_v1_4_vae.vmfb")
+        # os.remove(f"{arguments['safe_model_name']}_vae.safetensors")
+        # os.remove(f"{arguments['safe_model_name']}_vae.vmfb")


IanNod · 2024-01-19T22:14:34Z

python/turbine_models/tests/sd_test.py

-        os.remove("stable_diffusion_v1_4_vae.safetensors")
-        os.remove("stable_diffusion_v1_4_vae.vmfb")
+        # os.remove(f"{arguments['safe_model_name']}_vae.safetensors")
+        # os.remove(f"{arguments['safe_model_name']}_vae.vmfb")


IanNod · 2024-01-19T22:15:11Z

python/turbine_models/tests/sdxl_test.py

+
+
+class StableDiffusionTest(unittest.TestCase):
+    # def testExportClipModel(self):


Lots of commented code here?

IanNod · 2024-01-19T22:18:34Z

python/turbine_models/custom_models/sdxl_inference/unet.py

@@ -0,0 +1,190 @@
+# Copyright 2023 Nod Labs, Inc


Looks like a lot of repeated code from sd_inference/unet.py. Can we combine the two?

IanNod · 2024-01-19T22:19:00Z

python/turbine_models/custom_models/sdxl_inference/unet_runner.py

@@ -0,0 +1,163 @@
+import argparse


combine with other unet_runner?

need to robustify without SDPA decomps and add SDXL scheduler examples

dan-garvey

didnt they fix sdpa torch lowering?

monorimet · 2024-02-29T08:34:08Z

didnt they fix sdpa torch lowering?

yes and no.

with lowering to iree_linalg_ext.attention we have numerics issues on CPU after tileanddecomposeattention and pending implementation for vulkan, tiling for shared memory on ROCM -- see iree-org/iree#16421

so the lowering is OK from torch->linalg but the hal backends need tile and decompose better.

monorimet · 2024-03-07T15:50:06Z

I think we need this commit to be ported over to the upstream fx importer 30ef1fc
@aviator19941 / @PhaneeshB can you help with this

IanNod

There seems to be a lot of repeated code between SD and SDXL. Doesn't need to be done now but can you add todo's or create an issue to work on combining the two to get rid of that?

.github/workflows/test_models.yml

IanNod · 2024-04-10T18:26:52Z

core/shark_turbine/aot/builtins/jittable.py

@@ -214,6 +214,13 @@ def flat_wrapped_f(*args):
        if "functorch_functionalize" in self._passes:
            transformed_f = functorch_functionalize(transformed_f, *flat_pytorch_args)

+        for node in transformed_f.graph.nodes:  # type: ignore


isn't this being done in fx_importer.py? if we need it here instead should we remove it there?

Seems to be covered by torch-mlir's fx importer. Removing it here.

IanNod · 2024-04-10T18:27:01Z

core/shark_turbine/aot/builtins/jittable.py

+        for node in transformed_f.graph.nodes:  # type: ignore
+            if node.op == "call_function":
+                if node.target == torch._ops.ops.aten.lift_fresh_copy.default:
+                    print(f"replaced lift_fresh_copy")


remove printout?

IanNod · 2024-04-10T18:31:25Z

models/turbine_models/custom_models/sd_inference/sdxl_split_schedulers.py

+
+    decomp_list = DEFAULT_DECOMPOSITIONS
+
+    decomp_list.extend(


shouldn't SDPA decomp be flag guarded?

IanNod · 2024-04-10T18:31:38Z

models/turbine_models/custom_models/sd_inference/sdxl_split_schedulers.py

+            torch.ops.aten._scaled_dot_product_flash_attention.default,
+        ]
+    )
+    # encoder_hidden_states_sizes = (2, 77, 768)


remove commented code?

IanNod · 2024-04-10T18:39:17Z

models/turbine_models/custom_models/sd_inference/utils.py

 )

+winograd_params = "keys=unet.down_blocks.2.resnets.0.conv2.weight keys=unet.down_blocks.2.resnets.1.conv1.weight keys=unet.down_blocks.2.resnets.1.conv2.weight keys=unet.mid_block.resnets.0.conv1.weight keys=unet.mid_block.resnets.0.conv2.weight keys=unet.mid_block.resnets.1.conv1.weight keys=unet.mid_block.resnets.1.conv2.weight keys=unet.up_blocks.0.resnets.0.conv2.weight keys=unet.up_blocks.0.resnets.1.conv2.weight keys=unet.up_blocks.0.resnets.2.conv2.weight keys=unet.up_blocks.0.resnets.0.conv1.weight keys=unet.up_blocks.0.resnets.1.conv1.weight keys=unet.up_blocks.0.resnets.2.conv1.weight keys=unet.up_blocks.0.upsamplers.0.conv.weight"


We probably shouldn't have all of these winograd_params and flags hardcoded. Probably fine for now but might want a todo to address this

IanNod · 2024-04-10T18:42:09Z

models/turbine_models/custom_models/sd_inference/vae.py

@@ -93,6 +118,15 @@ def export_vae_model(
    upload_ir=False,
 ):
    mapper = {}
+    decomp_list = DEFAULT_DECOMPOSITIONS
+    decomp_list.extend(


flag guard these decomps?

IanNod · 2024-04-10T18:46:30Z

models/turbine_models/custom_models/sdxl_inference/sdxl_prompt_encoder.py

+        )
+        self.do_classifier_free_guidance = do_classifier_free_guidance
+
+    #     self.tokenizer_1 = CLIPTokenizer.from_pretrained(


commented code

IanNod · 2024-04-10T18:47:43Z

models/turbine_models/custom_models/sdxl_inference/sdxl_scheduled_unet.py

+    full_pipeline_file = (
+        pipe_prefix + "f32" if args.precision == "fp32" else pipe_prefix + "f16"
+    )
+    # pipeline_vmfb_path = utils.compile_to_vmfb(


commented code

IanNod · 2024-04-10T18:48:23Z

models/turbine_models/custom_models/sdxl_inference/sdxl_schedulers.py

+            torch.ops.aten._scaled_dot_product_flash_attention.default,
+        ]
+    )
+    # encoder_hidden_states_sizes = (2, 77, 768)


commented code

IanNod

LGTM!

monorimet requested review from aviator19941 and IanNod December 18, 2023 17:28

aviator19941 previously approved these changes Dec 18, 2023

View reviewed changes

monorimet force-pushed the ean-sd-fp16 branch 2 times, most recently from 155ae04 to 4498486 Compare January 5, 2024 06:27

monorimet force-pushed the ean-sd-fp16 branch from c1aebad to 1cec8dd Compare January 19, 2024 18:41

stellaraccident force-pushed the main branch from 5443da5 to 090f359 Compare January 19, 2024 18:56

IanNod reviewed Jan 19, 2024

View reviewed changes

monorimet changed the title ~~Add precision, max_length to unet, vae and guidance scale as input to unet.~~ (WIP) Add some parameters to UNet export/tests + SDXL Jan 22, 2024

monorimet force-pushed the ean-sd-fp16 branch 2 times, most recently from b1459fc to a0879c7 Compare February 9, 2024 20:40

monorimet force-pushed the ean-sd-fp16 branch 2 times, most recently from 4e2801f to 14fc107 Compare February 14, 2024 23:40

This was referenced Feb 15, 2024

GRAPH: Analyze SDXL Turbo model graph and list graph-level issues #418

Closed

MODEL: SDXL issues tracking. #462

Open

monorimet changed the title ~~(WIP) Add some parameters to UNet export/tests + SDXL~~ Adds SDXL support and CI testing, benchmarks. Feb 23, 2024

monorimet force-pushed the ean-sd-fp16 branch from 0d5e913 to 0b66db8 Compare February 27, 2024 08:03

dan-garvey reviewed Feb 27, 2024

View reviewed changes

monorimet force-pushed the ean-sd-fp16 branch 2 times, most recently from 56a8bfe to 4b28a12 Compare March 6, 2024 23:17

one-lithe-rune mentioned this pull request Mar 7, 2024

(Studio 2.0) add Stable Diffusion features nod-ai/SHARK-Studio#2037

Merged

monorimet force-pushed the ean-sd-fp16 branch from ec45b50 to 93812b7 Compare March 7, 2024 15:41

monorimet force-pushed the ean-sd-fp16 branch from f4f6261 to df1d410 Compare March 7, 2024 17:11

monorimet force-pushed the ean-sd-fp16 branch 2 times, most recently from fa6ba50 to 89274eb Compare April 9, 2024 15:52

cleanup pipeline test artifacts after completion.

0011328

monorimet force-pushed the ean-sd-fp16 branch from 89274eb to 0011328 Compare April 9, 2024 15:53

monorimet added 2 commits April 9, 2024 11:19

restrict wmma flags to gfx94X

4d7bfef

Decompose attention in CI tests.

946a02f

monorimet force-pushed the ean-sd-fp16 branch 3 times, most recently from 80fe4ce to df1002e Compare April 9, 2024 18:42

Pipe through attn spec option correctly.

77d4308

monorimet force-pushed the ean-sd-fp16 branch from df1002e to 77d4308 Compare April 9, 2024 18:44

monorimet and others added 4 commits April 9, 2024 14:21

Use fp16 for mi210 CI.

3336b6b

Fix default attention spec behavior

68c3c6c

Update test_models.yml

8dc1fba

xfail e2e on rocm shortly, pending move to nightly test

bfbebef

monorimet requested review from IanNod and aviator19941 April 10, 2024 05:19

IanNod requested changes Apr 10, 2024

View reviewed changes

monorimet and others added 3 commits April 10, 2024 18:54

Merge branch 'main' into ean-sd-fp16

246d32d

use config A for cpu CI

bb2c7e0

Remove xfails on submodels for rocm.

12b91f4

monorimet requested a review from IanNod April 11, 2024 16:00

monorimet force-pushed the ean-sd-fp16 branch from f5f60e0 to a3939e8 Compare April 11, 2024 16:02

Cleanup comments and redundant code.

575bcd0

monorimet force-pushed the ean-sd-fp16 branch from a3939e8 to 575bcd0 Compare April 11, 2024 16:06

Skip tests that crash on MI210 for now.

eaeb646

monorimet force-pushed the ean-sd-fp16 branch from cfb63ef to eaeb646 Compare April 11, 2024 19:35

IanNod approved these changes Apr 11, 2024

View reviewed changes

monorimet merged commit 1dea19e into main Apr 11, 2024
8 checks passed

monorimet deleted the ean-sd-fp16 branch April 11, 2024 20:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds SDXL support and CI testing, benchmarks. #271

Adds SDXL support and CI testing, benchmarks. #271

monorimet commented Dec 18, 2023 •

edited

Loading

aviator19941 left a comment

dan-garvey commented Dec 18, 2023

aviator19941 commented Dec 18, 2023

monorimet commented Dec 19, 2023

monorimet commented Dec 19, 2023

IanNod Jan 19, 2024

monorimet Jan 22, 2024

IanNod Jan 19, 2024

IanNod Jan 19, 2024

IanNod Jan 19, 2024

IanNod Jan 19, 2024

IanNod Jan 19, 2024

IanNod Jan 19, 2024

dan-garvey left a comment

monorimet commented Feb 29, 2024 •

edited

Loading

monorimet commented Mar 7, 2024

IanNod left a comment

IanNod Apr 10, 2024

monorimet Apr 11, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod Apr 10, 2024

IanNod left a comment



		class StableDiffusionTest(unittest.TestCase):
		# def testExportClipModel(self):

		)

		winograd_params = "keys=unet.down_blocks.2.resnets.0.conv2.weight keys=unet.down_blocks.2.resnets.1.conv1.weight keys=unet.down_blocks.2.resnets.1.conv2.weight keys=unet.mid_block.resnets.0.conv1.weight keys=unet.mid_block.resnets.0.conv2.weight keys=unet.mid_block.resnets.1.conv1.weight keys=unet.mid_block.resnets.1.conv2.weight keys=unet.up_blocks.0.resnets.0.conv2.weight keys=unet.up_blocks.0.resnets.1.conv2.weight keys=unet.up_blocks.0.resnets.2.conv2.weight keys=unet.up_blocks.0.resnets.0.conv1.weight keys=unet.up_blocks.0.resnets.1.conv1.weight keys=unet.up_blocks.0.resnets.2.conv1.weight keys=unet.up_blocks.0.upsamplers.0.conv.weight"

Adds SDXL support and CI testing, benchmarks. #271

Adds SDXL support and CI testing, benchmarks. #271

Conversation

monorimet commented Dec 18, 2023 • edited Loading

aviator19941 left a comment

Choose a reason for hiding this comment

dan-garvey commented Dec 18, 2023

aviator19941 commented Dec 18, 2023

monorimet commented Dec 19, 2023

monorimet commented Dec 19, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dan-garvey left a comment

Choose a reason for hiding this comment

monorimet commented Feb 29, 2024 • edited Loading

monorimet commented Mar 7, 2024

IanNod left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

IanNod left a comment

Choose a reason for hiding this comment

monorimet commented Dec 18, 2023 •

edited

Loading

monorimet commented Feb 29, 2024 •

edited

Loading