-
I followed the instructions to fine-tune at 512px and it works well. However, when I switch to the 1024px-MS version, it doesn't produce reasonable results. Would it be possible to share an official implementation for fine-tuning the 1024px-MS version of your work? That would be really helpful!
-
Maybe there is something wrong with the learnable scale and ratio embedding? Should I keep them fixed during training?
-
@SunzeY Same situation; I'm waiting for the author's response, too.
-
I think this is caused by a wrong positional embedding. Set `lewei_scale` to 2.0 in the config file and give it a try!
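For reference, a minimal sketch of the relevant change in a PixArt-style Python config (exact field names may differ in your config file; `lewei_scale` is the interpolation scale applied to the positional-embedding grid):

```python
# Hedged sketch of a 1024px-MS fine-tuning config fragment.
# The 1024px model uses a positional-embedding grid twice as large as
# the 512px base, so lewei_scale should be 2.0 (not the 512px default of 1.0).
image_size = 1024
lewei_scale = 2.0
```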
-
In inference.py, the non-EMA checkpoint is loaded, while fine-tuning saves the EMA weights under the 'state_dict_ema' key. Replacing the loading code solved my problem. PixArt-alpha/scripts/inference.py Line 158 in 82c8559
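A minimal sketch of such a loading helper (the 'state_dict_ema' / 'state_dict' key names follow the PixArt-alpha training checkpoints; the helper name and fallback behavior are my own):

```python
import torch


def load_ema_state_dict(model, ckpt_path):
    """Load EMA weights if present, else fall back to the plain weights.

    Fine-tuning saves EMA weights under 'state_dict_ema'; inference code
    that only reads 'state_dict' will silently load the non-EMA copy.
    """
    ckpt = torch.load(ckpt_path, map_location="cpu")
    if "state_dict_ema" in ckpt:
        state_dict = ckpt["state_dict_ema"]
    elif "state_dict" in ckpt:
        state_dict = ckpt["state_dict"]
    else:
        # Assume the file is a bare state dict.
        state_dict = ckpt
    # strict=False surfaces key mismatches without raising.
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    return missing, unexpected
```

Usage: call `load_ema_state_dict(model, "path/to/checkpoint.pth")` in place of the plain `torch.load(...)["state_dict"]` line, and check the returned missing/unexpected key lists.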
-
Does your config continue from the 1024 checkpoint? I had that issue. |