Consolidates SD pipelines and adds support for sharktank unet. #766
Conversation
):
    fxb = FxProgramsBuilder(vae_model)

    # @fxb.export_program(args=(encode_args,))
Remove the commented-out code.
        base_params=None,
    )
else:
    ds = None  # get_punet_dataset(results["config.json"], results["params.safetensors"], base_params=None)
Why is this commented out?
Haven't implemented it yet; I just ran i8 straight through and left placeholders for the other configurations. I'll replace these with TODOs or finish them after higher-priority items.
output = export(
    unet_model,
    kwargs=example_forward_args_dict,
    module_name="compiled_unet",
Can this be `compiled_punet`, or no?
It can.
    quant_params_struct=None,
    base_params=None,
):
    from sharktank.models.punet.tools.import_brevitas_dataset import (
Please don't do it this way. Just import the `main` function and pass it CL args rather than using the private internals.
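A minimal sketch of what that could look like, assuming the importer module exposes a `main(argv)`-style entrypoint (the `--output-irpa-file` flag name is an assumption; the other flags mirror the call site further down in this diff):

```python
# Sketch only: drives the importer through its CLI-style entrypoint instead of
# reaching into private internals. main(argv) and --output-irpa-file are
# assumptions based on the suggestion above.
from sharktank.models.punet.tools import import_brevitas_dataset


def build_punet_irpa(config_json_path, params_path, quant_params_path, output_path):
    import_brevitas_dataset.main(
        [
            f"--config-json={config_json_path}",
            f"--params={params_path}",
            f"--quant-params={quant_params_path}",
            f"--output-irpa-file={output_path}",
        ]
    )
    return output_path
```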
    # TODO: Post-process to introduce fused cross-layer connections.

    ds.save(output_path, io_report_callback=print)
There is a bug in the Dataset handling code where, if you construct a Dataset on the fly and then use it without loading it from disk, it does not annotate the imported tensors, resulting in them being inlined rather than referenced. I need to fix that bug, but the workflow here shouldn't be doing this anyway.
If you follow the advice in the comment above and use the main() entrypoint to generate the IRPA on disk, then do Dataset.load(irpa_path), that will work.
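A short sketch of that flow (the `sharktank.types` import path is an assumption):

```python
# Sketch: generate the IRPA on disk first (e.g. via the importer's main()
# entrypoint as sketched above), then load it back so the imported tensors are
# referenced rather than inlined.
from sharktank.types import Dataset  # import path is an assumption

irpa_path = "punet_dataset_i8.irpa"
ds = Dataset.load(irpa_path)
```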
Needs iree-org/iree-turbine#40 to land for some metadata pass fixes on CLIP.
Left a number of comments that would be good to address here or in a follow-up.
        no_boseos_middle=no_boseos_middle,
        chunk_length=pipe.model_max_length,
    )
    print("The following text was removed due to truncation:", removed_text)
Did you mean to use `print` like this? Generally not great to spew to stdout from a library. Maybe use `warnings.warn(...)` instead?
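A sketch of the suggested change, reusing the `removed_text` variable from the snippet above:

```python
import warnings

# Warn instead of printing, so library consumers can filter or silence it.
if removed_text:
    warnings.warn(
        f"The following text was removed due to truncation: {removed_text}"
    )
```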
        torch.empty(1, dtype=dtype),
    ]
    decomp_list = []
    if decomp_attn == True:
Why not just `if decomp_attn:`?
@@ -331,7 +340,8 @@ def get_wmma_spec_path(target_chip, save_dir, masked_attention=False):
    else:
        return None
    attn_spec = urlopen(url).read().decode("utf-8")
I suppose this isn't the only place where we are fetching network resources, but I wish we weren't doing this... some day.
    repo_id = "amd-shark/sdxl-quant-models"
    subfolder = "unet/int8"
    revision = "82e06d6ea22ac78102a9aded69e8ddfb9fa4ae37"
elif precision in ["fp16", "fp32"]:
FYI - the sdxl-quant-models repo does have an fp16 variant IRPA file. I use that for various comparisons, etc.
https://huggingface.co/amd-shark/sdxl-quant-models/tree/main/unet/fp16/export
I guess we can decide whether we want to do the full model download/quantization here or just fetch the IRPA file from the export subdir. If we did the latter, you could drop get_punet_dataset(), which actually builds it, and just use the one we publish (roughly as sketched below). In that case, you can eliminate this branch and just change the subfolder and filename or something.
Fine to check in like this since it is working, but can simplify by leaning more on published artifacts.
Now that I'm thinking of it, though, we're going to need a procedure for ML-Perf to rely completely on generated artifacts. Currently, that is being done by providing scripts to regenerate, but then we probably also need to support flags here and elsewhere to side-load.
In any case, let's land and discuss/iterate.
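If the published-artifact route is taken, fetching the fp16 IRPA directly could look roughly like this (the filename inside the export subfolder is a placeholder, not confirmed):

```python
from huggingface_hub import hf_hub_download

# Sketch: pull the published IRPA instead of rebuilding it locally.
# "punet_fp16.irpa" is a placeholder; check the repo for the actual filename.
irpa_path = hf_hub_download(
    repo_id="amd-shark/sdxl-quant-models",
    subfolder="unet/fp16/export",
    filename="punet_fp16.irpa",
)
```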
}
if precision == "i8":
    results["quant_params.json"] = download("quant_params.json")
    output_path = external_weight_path.split("unet")[0] + "punet_dataset_i8.irpa"
What would happen if this were in the directory `C:\Users\unet\Downloads`? It would be better to use pathlib for manipulation of just the stem or something.
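A rough pathlib-based alternative; note it writes the IRPA next to the weight file rather than reproducing the string-splitting behavior exactly:

```python
from pathlib import Path

# Sketch: derive the output location from the weight file's directory instead of
# splitting the path string on "unet", which misbehaves when "unet" appears in a
# parent directory (e.g. C:\Users\unet\Downloads\...).
output_path = str(Path(external_weight_path).parent / "punet_dataset_i8.irpa")
```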
def get_punet_dataset(
    config_json_path,
    params_path,
    output_path="./punet_dataset_i8.irpa",
Remove the default. Should always be provided, I think.
    output_path="./punet_dataset_i8.irpa",
    quant_params_path=None,
    quant_params_struct=None,
    base_params=None,
Seems unused; remove (this was a hack/prototype thing that didn't work out anyway).
    params_path,
    output_path="./punet_dataset_i8.irpa",
    quant_params_path=None,
    quant_params_struct=None,
Not used; remove?
        [
            f"--config-json={config_json_path}",
            f"--params={params_path}",
            f"--quant-params={quant_params_path}",
I don't think this will work for the fp16 version like this (the literal value "None" will try to load a file named "None").
There is an importer right next to this for a stock Hugging Face dataset that was intended to be used for this case. We should probably switch here, but judging from the above, I'm guessing that fp16 punet has not been tested yet. We can flesh that out when it is.
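One way to avoid passing the literal string "None" is to build the argument list conditionally, e.g.:

```python
# Sketch: only forward --quant-params when a path was actually provided, so
# fp16 runs (with no quant params file) don't try to open a file named "None".
argv = [
    f"--config-json={config_json_path}",
    f"--params={params_path}",
]
if quant_params_path is not None:
    argv.append(f"--quant-params={quant_params_path}")
```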
elif (not decomp_attn) and "gfx9" in target:
    attn_spec = "mfma"
elif (not decomp_attn) and "gfx11" in target:
    attn_spec = "wmma"
This kind of thing makes me sad. Nothing to do about it now.