
[Flux] Add advanced training script + support textual inversion inference #9434

Merged

Conversation

@linoytsaban (Collaborator) commented on Sep 13, 2024:

This PR adds an advanced version of the DreamBooth LoRA Flux script, with an accompanying update to FluxPipeline (a usage sketch follows the list below):

  • adds an advanced script with a pivotal tuning feature for the CLIP & T5 encoders

    • --train_text_encoder_ti enables CLIP pivotal tuning
    • --enable_t5_ti adds T5 to the mix
    • --train_text_encoder_ti_frac the fraction of epochs to train the embeddings for (when using CLIP only)
    • --train_transformer_frac the fraction of epochs to train the transformer for; train_transformer_frac==0 triggers a "pure textual inversion" run (i.e. "classical" textual inversion, with no optimization of the transformer LoRA layers)
    • --initializer_token the token used to initialize the textual inversion embeddings (random initialization by default)
    • --lora_blocks the blocks/layers to apply LoRA training to

  • modifies FluxPipeline (and related pipelines) to allow textual inversion inference (which is also required by LoRAs trained with pivotal tuning)
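
A minimal usage sketch (not taken from this PR) of how these flags might be combined when launching the advanced script; the script filename, model id, dataset path, prompt, and output directory below are illustrative assumptions:

```python
# Hedged sketch: launch the advanced DreamBooth LoRA Flux script with the new
# pivotal-tuning flags. All paths, the model id, and the prompt are placeholders.
import subprocess

subprocess.run(
    [
        "accelerate", "launch",
        "train_dreambooth_lora_flux_advanced.py",  # assumed script filename
        "--pretrained_model_name_or_path=black-forest-labs/FLUX.1-dev",
        "--instance_data_dir=./dog",               # placeholder dataset
        "--instance_prompt=a photo of TOK dog",    # "TOK" stands in for the new token(s)
        "--train_text_encoder_ti",                 # CLIP pivotal tuning
        "--enable_t5_ti",                          # also train T5 embeddings
        "--train_text_encoder_ti_frac=0.5",        # train embeddings for half the epochs
        "--train_transformer_frac=1.0",            # 0 would mean pure textual inversion
        "--initializer_token=dog",                 # init embeddings from an existing token
        "--output_dir=flux-lora-pivotal",          # placeholder output path
    ],
    check=True,
)
```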

Motivation: allow for fast iteration and experimental features. I think it'd be good to fork the canonical script and bring the above changes into the advanced folder, in a similar manner to what we have for SDXL.

cc @apolinario

linoytsaban and others added 28 commits August 12, 2024 17:30
@linoytsaban (Collaborator, Author) commented:

@sayakpaul @apolinario what do you think about closing #9160 and moving the changes here, as I suggest above?

@sayakpaul (Member) commented:

I would prefer #9160 as it helps to review the changes to the canonical script in isolation. Would that work for you?

@linoytsaban linoytsaban requested a review from sayakpaul October 14, 2024 19:26
@sayakpaul (Member) left a comment:


Thanks!

@sayakpaul sayakpaul requested a review from yiyixuxu October 15, 2024 01:19
@sayakpaul (Member) commented:

@yiyixuxu could you also give the changes in the pipelines a look? It's just about adding TextualInversionLoaderMixin so that we can enable pivotal tuning on Flux.
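
For context, a rough sketch of the inference pattern this mixin enables; the repo id, weight filename, and token name below are placeholders and not part of this PR:

```python
# Hedged sketch of textual-inversion inference on Flux once the mixin is in place.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# LoRA weights produced by pivotal tuning (placeholder repo id).
pipe.load_lora_weights("your-username/your-flux-pivotal-lora")

# The learned embeddings are stored separately and loaded via the new mixin
# (placeholder filename and token name).
pipe.load_textual_inversion(
    "your-username/your-flux-pivotal-lora",
    weight_name="learned_embeds.safetensors",
    token="<s0>",
)

image = pipe("a photo of <s0> dog", num_inference_steps=28).images[0]
image.save("flux_ti.png")
```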

@yiyixuxu (Collaborator) left a comment:


thanks!

@apolinario (Collaborator) left a comment:


Let's go! 🚀

@linoytsaban linoytsaban merged commit 9a7f824 into huggingface:main Oct 17, 2024
15 checks passed
@linoytsaban linoytsaban deleted the dreambooth-lora-flux-exploration branch November 26, 2024 10:18
sayakpaul added a commit that referenced this pull request Dec 23, 2024
…ence (#9434)

* add ostris trainer to README & add cache latents of vae

* add ostris trainer to README & add cache latents of vae

* style

* readme

* add test for latent caching

* add ostris noise scheduler
https://github.com/ostris/ai-toolkit/blob/9ee1ef2a0a2a9a02b92d114a95f21312e5906e54/toolkit/samplers/custom_flowmatch_sampler.py#L95

* style

* fix import

* style

* fix tests

* style

* --change upcasting of transformer?

* update readme according to main

* add pivotal tuning for CLIP

* fix imports, encode_prompt call,add TextualInversionLoaderMixin to FluxPipeline for inference

* TextualInversionLoaderMixin support for FluxPipeline for inference

* move changes to advanced flux script, revert canonical

* add latent caching to canonical script

* revert changes to canonical script to keep it separate from #9160

* revert changes to canonical script to keep it separate from #9160

* style

* remove redundant line and change code block placement to align with logic

* add initializer_token arg

* add transformer frac for range support from pure textual inversion to the orig pivotal tuning

* support pure textual inversion - wip

* adjustments to support pure textual inversion and transformer optimization in only part of the epochs

* fix logic when using initializer token

* fix pure_textual_inversion_condition

* fix ti/pivotal loading of last validation run

* remove embeddings loading for ti in final training run (to avoid adding huggingface hub dependency)

* support pivotal for t5

* adapt pivotal for T5 encoder

* adapt pivotal for T5 encoder and support in flux pipeline

* t5 pivotal support + support fo pivotal for clip only or both

* fix param chaining

* fix param chaining

* README first draft

* readme

* readme

* readme

* style

* fix import

* style

* add fix from #9419

* add to readme, change function names

* te lr changes

* readme

* change concept tokens logic

* fix indices

* change arg name

* style

* dummy test

* revert dummy test

* reorder pivoting

* add warning in case the token abstraction is not the instance prompt

* experimental - wip - specific block training

* fix documentation and token abstraction processing

* remove transformer block specification feature (for now)

* style

* fix copies

* fix indexing issue when --initializer_concept has different amounts

* add if TextualInversionLoaderMixin to all flux pipelines

* style

* fix import

* fix imports

* address review comments - remove necessary prints & comments, use pin_memory=True, use free_memory utils, unify warning and prints

* style

* logger info fix

* make lora target modules configurable and change the default

* make lora target modules configurable and change the default

* style

* make lora target modules configurable and change the default, add notes to readme

* style

* add tests

* style

* fix repo id

* add updated requirements for advanced flux

* fix indices of t5 pivotal tuning embeddings

* fix path in test

* remove `pin_memory`

* fix filename of embedding

* fix filename of embedding

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: YiYi Xu <[email protected]>