Optional quasirandom timesteps, zero terminal SNR, cosine schedule for SD models #138

Merged: 5 commits, May 13, 2024

coryMosaicML
Collaborator

This PR enables a few optional tweaks to the training process for SD style models that differ from the standard training recipe:

quasirandomness: Select timesteps quasi-randomly rather than uniformly at random, to reduce loss variance.
zero_terminal_snr: Rescale the noise schedule so that the final timestep has zero SNR.
beta_schedule: Select the noise schedule: scaled_linear (default), linear, or squaredcos_cap_v2 (cosine).
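
For reference, here is a minimal, dependency-free sketch of what the three beta_schedule names and the zero_terminal_snr rescaling usually mean. It is not the code in this PR. The function names, the 1000-step default, and the beta_start/beta_end values (0.00085 and 0.012, the commonly used SD settings) are assumptions for illustration, and the rescaling follows the widely used "shift and scale sqrt(alpha_bar)" recipe.

```python
import math


def make_betas(schedule, num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Return per-step betas for one schedule name (assumes num_steps >= 2)."""
    steps = range(num_steps)
    if schedule == "linear":
        # Betas evenly spaced between beta_start and beta_end.
        return [beta_start + (beta_end - beta_start) * i / (num_steps - 1) for i in steps]
    if schedule == "scaled_linear":
        # Evenly spaced in sqrt(beta), then squared.
        lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
        return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in steps]
    if schedule == "squaredcos_cap_v2":
        # Cosine schedule: alpha_bar(t) = cos^2(((t + 0.008) / 1.008) * pi / 2),
        # with each beta capped at 0.999.
        def alpha_bar(t):
            return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

        return [
            min(1.0 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), 0.999)
            for i in steps
        ]
    raise ValueError(f"unknown beta schedule: {schedule}")


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def rescale_zero_terminal_snr(betas):
    """Rescale betas so alpha_bar at the last step is exactly 0 (zero SNR)."""
    sqrt_ab = [math.sqrt(a) for a in cumulative_alphas(betas)]
    first, last = sqrt_ab[0], sqrt_ab[-1]
    # Shift so the final value is zero, then scale so the first value is preserved.
    sqrt_ab = [(s - last) * first / (first - last) for s in sqrt_ab]
    alphas_bar = [s * s for s in sqrt_ab]
    # Convert cumulative products back to per-step alphas, then to betas.
    alphas = [alphas_bar[0]] + [
        alphas_bar[i] / alphas_bar[i - 1] for i in range(1, len(alphas_bar))
    ]
    return [1.0 - a for a in alphas]
```

With SNR defined as alpha_bar / (1 - alpha_bar), the unrescaled schedules leave a small but nonzero SNR at the last timestep. After rescaling, the last beta is 1 and the terminal SNR is 0.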

By default, all three options keep the standard SD2 and SDXL settings.
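
To illustrate the idea behind the quasirandomness option, here is a small sketch of low-discrepancy timestep selection. It uses a base-2 van der Corput (radical inverse) sequence as a simple one-dimensional stand-in for the Sobol sampler discussed in the review. It is not the PR's implementation, and the function names, the start_index and shift parameters, and the 1000-step default are hypothetical.

```python
def radical_inverse_base2(index):
    """Mirror the binary digits of `index` about the binary point, giving a value in [0, 1)."""
    value, scale = 0.0, 0.5
    while index > 0:
        if index & 1:
            value += scale
        index >>= 1
        scale *= 0.5
    return value


def quasirandom_timesteps(batch_size, num_steps=1000, start_index=0, shift=0.0):
    """Map consecutive low-discrepancy points to integer timesteps in [0, num_steps).

    `start_index` lets successive batches continue the sequence, and `shift`
    (a value in [0, 1), hypothetical here) applies a wrap-around offset so
    that different runs need not visit identical timesteps.
    """
    timesteps = []
    for i in range(start_index, start_index + batch_size):
        u = (radical_inverse_base2(i) + shift) % 1.0
        timesteps.append(min(int(u * num_steps), num_steps - 1))
    return timesteps
```

For example, `quasirandom_timesteps(8)` returns `[0, 500, 250, 750, 125, 625, 375, 875]`. Each batch covers the timestep range evenly, whereas independent uniform draws can cluster, and that even coverage is what is expected to lower the batch-to-batch loss variance.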

Contributor

@Landanjs left a comment

Nice! I had two questions about the quasirandom sampling, one clarification on the schedulers, and a nit.

I'm not familiar with Sobol sequences, so I don't have an intuition for whether this is the best way to do it, but I trust your decision! From the description of Sobol sequences, it sounds like what we need.

Resolved review threads (outdated): diffusion/models/models.py (1), diffusion/models/stable_diffusion.py (3)

Contributor

@Landanjs left a comment

LGTM

@coryMosaicML coryMosaicML merged commit 35fa48a into mosaicml:main May 13, 2024
5 checks passed
3 participants