Support loading and using SkyReels-V1-Hunyuan-I2V (#6862)
* Support SkyReels-V1-Hunyuan-I2V

* VAE scaling

* Fix T2V

oops

* Proper latent scaling
kijai authored Feb 18, 2025
1 parent b07258c commit acc152b
Showing 3 changed files with 11 additions and 2 deletions.
2 changes: 1 addition & 1 deletion comfy/ldm/hunyuan_video/model.py
@@ -310,7 +310,7 @@ def block_wrap(args):
             shape[i] = shape[i] // self.patch_size[i]
         img = img.reshape([img.shape[0]] + shape + [self.out_channels] + self.patch_size)
         img = img.permute(0, 4, 1, 5, 2, 6, 3, 7)
-        img = img.reshape(initial_shape)
+        img = img.reshape(initial_shape[0], self.out_channels, initial_shape[2], initial_shape[3], initial_shape[4])
         return img
 
     def forward(self, x, timestep, context, y, guidance=None, attention_mask=None, control=None, transformer_options={}, **kwargs):
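
The one-line reshape change matters once the input and output channel counts differ. Below is a rough sketch in plain Python (shapes only, no tensors) of why reshaping back to `initial_shape` stops working. The sizes are illustrative assumptions: 32 input channels is taken from the SkyReels comment in `model_detection.py`, 16 output channels from `dit_config["out_channels"]`, and the helper names are hypothetical.

```python
from math import prod

def unpatchified_shape(initial_shape, out_channels):
    """Target shape used by the patched reshape: the input's batch, frames,
    height and width, but the model's output channel count."""
    b, _, t, h, w = initial_shape
    return (b, out_channels, t, h, w)

def produced_elements(initial_shape, out_channels, patch_size):
    """Number of elements the final layer produces before the last reshape."""
    b, _, t, h, w = initial_shape
    pt, ph, pw = patch_size
    tokens = (t // pt) * (h // ph) * (w // pw)
    return b * tokens * out_channels * pt * ph * pw

# Assumed example sizes: a 32-channel I2V input and a 16-channel output.
initial_shape = (1, 32, 9, 40, 64)
patch_size = (1, 2, 2)
out_channels = 16

produced = produced_elements(initial_shape, out_channels, patch_size)
old_target = prod(initial_shape)  # what reshape(initial_shape) would need
new_target = prod(unpatchified_shape(initial_shape, out_channels))

print(produced == old_target)  # False: element counts differ, so the old reshape fails
print(produced == new_target)  # True
```

For a 16-channel T2V input the two targets coincide, which is presumably why the original reshape was sufficient before I2V support.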
9 changes: 9 additions & 0 deletions comfy/model_base.py
@@ -871,6 +871,15 @@ def extra_conds(self, **kwargs):
         if cross_attn is not None:
             out['c_crossattn'] = comfy.conds.CONDRegular(cross_attn)
 
+        image = kwargs.get("concat_latent_image", None)
+        noise = kwargs.get("noise", None)
+
+        if image is not None:
+            padding_shape = (noise.shape[0], 16, noise.shape[2] - 1, noise.shape[3], noise.shape[4])
+            latent_padding = torch.zeros(padding_shape, device=noise.device, dtype=noise.dtype)
+            image_latents = torch.cat([image.to(noise), latent_padding], dim=2)
+            out['c_concat'] = comfy.conds.CONDNoiseShape(self.process_latent_in(image_latents))
+
         guidance = kwargs.get("guidance", 6.0)
         if guidance is not None:
             out['guidance'] = comfy.conds.CONDRegular(torch.FloatTensor([guidance]))
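
The shape bookkeeping in the new `extra_conds` block can be sketched without tensors. This assumes, as the `noise.shape[2] - 1` padding implies, that `concat_latent_image` holds a single latent frame. The function name and the example sizes are hypothetical.

```python
def concat_latent_shape(image_shape, noise_shape, latent_channels=16):
    """Return the shape of the zero-padded image latent built in extra_conds.

    Both shapes use the layout (batch, channels, frames, height, width).
    """
    b, _, t, h, w = noise_shape
    padding_shape = (b, latent_channels, t - 1, h, w)  # zeros for the remaining frames
    # Concatenating along dim=2 (frames) requires every other dimension to match.
    for dim in (0, 1, 3, 4):
        if image_shape[dim] != padding_shape[dim]:
            raise ValueError(
                f"dimension {dim} differs: {image_shape[dim]} vs {padding_shape[dim]}"
            )
    frames = image_shape[2] + padding_shape[2]
    return (b, latent_channels, frames, h, w)

noise_shape = (1, 16, 9, 40, 64)  # assumed example sizes
image_shape = (1, 16, 1, 40, 64)  # one encoded start frame
padded = concat_latent_shape(image_shape, noise_shape)
print(padded)  # (1, 16, 9, 40, 64)
```

With a single-frame image latent the padded result has the same shape as the noise. If the two are then concatenated along the channel dimension downstream (an assumption, not shown in this diff), 16 noise channels plus 16 conditioning channels would account for the 32 input channels mentioned in `model_detection.py`.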
2 changes: 1 addition & 1 deletion comfy/model_detection.py
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ def detect_unet_config(state_dict, key_prefix):
if '{}txt_in.individual_token_refiner.blocks.0.norm1.weight'.format(key_prefix) in state_dict_keys: #Hunyuan Video
dit_config = {}
dit_config["image_model"] = "hunyuan_video"
dit_config["in_channels"] = 16
dit_config["in_channels"] = state_dict["img_in.proj.weight"].shape[1] #SkyReels img2video has 32 input channels
dit_config["patch_size"] = [1, 2, 2]
dit_config["out_channels"] = 16
dit_config["vec_in_dim"] = 768
Expand Down
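
The changed line indexes `state_dict` with the bare key `img_in.proj.weight`, whereas the detection check a few lines earlier formats its key with `key_prefix`. A hedged sketch of a prefix-aware lookup that falls back to the previous default of 16 follows. `detect_in_channels` and `FakeTensor` are hypothetical names for illustration, not ComfyUI code, and the 32-channel weight shape is an assumed example.

```python
class FakeTensor:
    """Stand-in for a tensor: only the .shape attribute is needed here."""
    def __init__(self, shape):
        self.shape = tuple(shape)

def detect_in_channels(state_dict, key_prefix, default=16):
    """Read in_channels from the patch-embedding weight, honoring key_prefix."""
    key = "{}img_in.proj.weight".format(key_prefix)
    weight = state_dict.get(key)
    if weight is None:
        return default  # keep the pre-change value when the key is absent
    return weight.shape[1]

# Prefixed key and shape as in the safetensors header quoted in the comments.
prefixed = {"model.model.img_in.proj.weight": FakeTensor([3072, 16, 1, 2, 2])}
# Assumed shape for a 32-channel I2V checkpoint without a prefix.
i2v = {"img_in.proj.weight": FakeTensor([3072, 32, 1, 2, 2])}

print(detect_in_channels(prefixed, "model.model."))  # 16
print(detect_in_channels(i2v, ""))                   # 32
print(detect_in_channels({}, ""))                    # 16 (fallback)
```

The fallback is a design choice in this sketch: it keeps older checkpoints loading, at the cost of silently hiding a missing key.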

3 comments on commit acc152b

@brendanhoar brendanhoar commented on acc152b Feb 19, 2025

It seems some hunyuanvideo models do not have img_in.proj.weight in their state_dict and this is causing a load failure for them. After this batch is done I'll try to reproduce.

Hmm, it does seem to have it. I need to look at the error again...
"model.model.img_in.proj.weight": {
"dtype": "F8_E4M3",
"shape": [3072, 16, 1, 2, 2],
"data_offsets": [8617924974, 8618121582]
},

This is the error I am seeing:

02:06:44.479 [Debug] [ComfyUI-0/STDERR] !!! Exception during processing !!! 'img_in.proj.weight'
02:06:44.482 [Warning] [ComfyUI-0/STDERR] Traceback (most recent call last):
02:06:44.484 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 327, in execute
02:06:44.485 [Warning] [ComfyUI-0/STDERR] output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
02:06:44.487 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.488 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 202, in get_output_data
02:06:44.490 [Warning] [ComfyUI-0/STDERR] return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
02:06:44.492 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.494 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 174, in _map_node_over_list
02:06:44.496 [Warning] [ComfyUI-0/STDERR] process_inputs(input_dict, i)
02:06:44.498 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\execution.py", line 163, in process_inputs
02:06:44.499 [Warning] [ComfyUI-0/STDERR] results.append(getattr(obj, func)(**inputs))
02:06:44.501 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.503 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\nodes.py", line 570, in load_checkpoint
02:06:44.504 [Warning] [ComfyUI-0/STDERR] out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
02:06:44.505 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.507 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\comfy\sd.py", line 858, in load_checkpoint_guess_config
02:06:44.508 [Warning] [ComfyUI-0/STDERR] out = load_state_dict_guess_config(sd, output_vae, output_clip, output_clipvision, embedding_directory, output_model, model_options, te_model_options=te_model_options)
02:06:44.509 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.510 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\comfy\sd.py", line 875, in load_state_dict_guess_config
02:06:44.511 [Warning] [ComfyUI-0/STDERR] model_config = model_detection.model_config_from_unet(sd, diffusion_model_prefix)
02:06:44.514 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.515 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\comfy\model_detection.py", line 437, in model_config_from_unet
02:06:44.517 [Warning] [ComfyUI-0/STDERR] unet_config = detect_unet_config(state_dict, unet_key_prefix)
02:06:44.519 [Warning] [ComfyUI-0/STDERR] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
02:06:44.521 [Warning] [ComfyUI-0/STDERR] File "G:___all_webuis\SwarmUI\dlbackend\comfy\ComfyUI\comfy\model_detection.py", line 139, in detect_unet_config
02:06:44.524 [Warning] [ComfyUI-0/STDERR] dit_config["in_channels"] = state_dict["img_in.proj.weight"].shape[1] #SkyReels img2video has 32 input channels
02:06:44.526 [Warning] [ComfyUI-0/STDERR] ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
02:06:44.529 [Warning] [ComfyUI-0/STDERR] KeyError: 'img_in.proj.weight'

02:06:44.530 [Warning] [ComfyUI-0/STDERR]

@maedtb maedtb commented on acc152b Feb 19, 2025

@brendanhoar: This issue should be addressed by #6877, if you want to try that in the meantime.

@kijai kijai commented on acc152b Feb 20, 2025

Hmm, sorry about that. Out of curiosity, which model is that? All the models I had worked, so I didn't catch this in my initial testing.
