After upgrading and rerunning the setup, Flux finetuning went from 14s/it to 164s/it #2874

veighnsche opened this issue Oct 1, 2024 · 1 comment


veighnsche commented Oct 1, 2024

Windows 10, RTX 3090 24GB, running Kohya locally for Flux DreamBooth training. I followed the SECourses video to configure Kohya. The last time I used Kohya was Sept 1, 2024; it is now Oct 1, 2024. First I ran Kohya without updating, and the performance was 14s/it. Then I cancelled training, ran git pull, reran the setup, and configured the accelerator. Now the performance is 130-160s/it.

I didn't touch most of the parameters; most of the config comes from a pre-made config file. What am I doing wrong?
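
To put the slowdown in perspective, here is a back-of-the-envelope estimate using only numbers from this post: 220 images, 15 epochs, batch size 1, and the two per-step rates (14s/it before, 164s/it from the title). The helper name `hours_for` is made up for illustration, and nothing here is measured; it is just arithmetic.

```python
# Rough wall-clock estimate for the run described in this post.
# Step count and per-step times are taken from the post; nothing here is measured.

MAX_TRAIN_STEPS = 220 * 15  # 220 images x 15 epochs, batch size 1 -> 3300 steps


def hours_for(steps: int, seconds_per_it: float) -> float:
    """Convert a step count and a seconds-per-iteration rate into hours."""
    return steps * seconds_per_it / 3600


before = hours_for(MAX_TRAIN_STEPS, 14)   # rate before the update
after = hours_for(MAX_TRAIN_STEPS, 164)   # rate from the issue title
print(f"before: {before:.1f} h, after: {after:.1f} h, slowdown: {164 / 14:.1f}x")
# prints "before: 12.8 h, after: 150.3 h, slowdown: 11.7x"
```

So the same 3300-step run goes from roughly half a day to more than six days.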

(venv) PS D:\Kohya_GUI_Flux_Installer_21\kohya_ss> git status
On branch sd3-flux.1
Your branch is up to date with 'origin/sd3-flux.1'.
adaptive_noise_scale = 0
ae = "D:/ComfyUI_windows_portable/ComfyUI/models/vae/ae.safetensors"
blocks_to_swap = 0
bucket_no_upscale = true
bucket_reso_steps = 64
cache_latents = true
cache_latents_to_disk = true
cache_text_encoder_outputs = true
cache_text_encoder_outputs_to_disk = true
caption_dropout_every_n_epochs = 0
caption_dropout_rate = 0
caption_extension = ".txt"
clip_l = "D:/ComfyUI_windows_portable/ComfyUI/models/clip/clip_l.safetensors"
cpu_offload_checkpointing = true
discrete_flow_shift = 3.1582
double_blocks_to_swap = 5
dynamo_backend = "no"
epoch = 15
full_bf16 = true
fused_backward_pass = true
gradient_accumulation_steps = 1
gradient_checkpointing = true
guidance_scale = 1
huber_c = 0.1
huber_schedule = "snr"
keep_tokens = 0
learning_rate = 4e-6
learning_rate_te = 0
logging_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\log"
loss_type = "l2"
lr_scheduler = "constant"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
lr_warmup_steps = 0
max_bucket_reso = 2048
max_data_loader_n_workers = 0
max_timestep = 1000
max_token_length = 75
max_train_steps = 3300
mem_eff_save = true
min_bucket_reso = 256
mixed_precision = "bf16"
model_prediction_type = "raw"
multires_noise_discount = 0.3
multires_noise_iterations = 0
noise_offset = 0
noise_offset_type = "Original"
optimizer_args = [ "scale_parameter=False", "relative_step=False", "warmup_init=False", "weight_decay=0.01",]
optimizer_type = "Adafactor"
output_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\model"
output_name = "alr_p"
persistent_data_loader_workers = 0
pretrained_model_name_or_path = "D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005.safetensors"
prior_loss_weight = 1
resolution = "1024,1024"
resume = "D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005-state"
sample_prompts = "D:/Kohya_GUI_Flux_Installer_21/train_4\\model\\sample/prompt.txt"
sample_sampler = "euler_a"
save_every_n_epochs = 3
save_model_as = "safetensors"
save_precision = "fp16"
save_state = true
save_state_on_train_end = true
sdpa = true
seed = 1
single_blocks_to_swap = 0
t5xxl = "D:/ComfyUI_windows_portable/ComfyUI/models/clip/t5xxl_fp16.safetensors"
t5xxl_max_token_length = 512
timestep_sampling = "sigmoid"
train_batch_size = 1
train_blocks = "all"
train_data_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\img"
vae_batch_size = 4
wandb_run_name = "alr_p"
(venv) PS D:\Kohya_GUI_Flux_Installer_21\kohya_ss> pip list
Package                      Version                 Editable project location
---------------------------- ----------------------- --------------------------------------------------
absl-py                      2.1.0
accelerate                   0.33.0
aiofiles                     23.2.1
aiohappyeyeballs             2.4.3
aiohttp                      3.10.8
aiosignal                    1.3.1
altair                       4.2.2
annotated-types              0.7.0
antlr4-python3-runtime       4.9.3
anyio                        4.6.0
appdirs                      1.4.4
astunparse                   1.6.3
async-timeout                4.0.3
attrs                        24.2.0
bitsandbytes                 0.44.0
certifi                      2024.8.30
charset-normalizer           3.3.2
click                        8.1.7
colorama                     0.4.6
coloredlogs                  15.0.1
contourpy                    1.3.0
cycler                       0.12.1
dadaptation                  3.2
diffusers                    0.25.0
docker-pycreds               0.4.0
easygui                      0.98.3
einops                       0.7.0
entrypoints                  0.4
exceptiongroup               1.2.2
fairscale                    0.4.13
fastapi                      0.112.4
ffmpy                        0.4.0
filelock                     3.16.1
flatbuffers                  24.3.25
fonttools                    4.54.1
frozenlist                   1.4.1
fsspec                       2024.9.0
ftfy                         6.1.1
gast                         0.6.0
gitdb                        4.0.11
GitPython                    3.1.43
google-pasta                 0.2.0
gradio                       4.43.0
gradio_client                1.3.0
grpcio                       1.66.2
h11                          0.14.0
h5py                         3.12.1
httpcore                     1.0.5
httpx                        0.27.2
huggingface-hub              0.24.5
humanfriendly                10.0
idna                         3.10
imagesize                    1.4.1
importlib_metadata           8.5.0
importlib_resources          6.4.5
invisible-watermark          0.2.0
Jinja2                       3.1.4
jsonschema                   4.23.0
jsonschema-specifications    2023.12.1
keras                        3.5.0
kiwisolver                   1.4.7
libclang                     18.1.1
library                      0.0.0                   d:\kohya_gui_flux_installer_21\kohya_ss\sd-scripts
lightning-utilities          0.11.7
lion-pytorch                 0.0.6
lycoris-lora                 2.2.0.post3
Markdown                     3.7
markdown-it-py               3.0.0
MarkupSafe                   2.1.5
matplotlib                   3.9.2
mdurl                        0.1.2
ml-dtypes                    0.4.1
mpmath                       1.3.0
multidict                    6.1.0
namex                        0.0.8
networkx                     3.3
numpy                        1.26.4
omegaconf                    2.3.0
onnx                         1.16.1
onnxruntime-gpu              1.17.1
open-clip-torch              2.20.0
opt_einsum                   3.4.0
optree                       0.12.1
orjson                       3.10.7
packaging                    24.1
pandas                       2.2.3
pathtools                    0.1.2
pillow                       10.4.0
pip                          23.0.1
platformdirs                 4.3.6
prodigyopt                   1.0
protobuf                     3.20.3
psutil                       6.0.0
pydantic                     2.9.2
pydantic_core                2.23.4
pydub                        0.25.1
Pygments                     2.18.0
pyparsing                    3.1.4
pyreadline3                  3.5.4
python-dateutil              2.9.0.post0
python-multipart             0.0.12
pytorch-lightning            1.9.0
pytz                         2024.2
PyWavelets                   1.7.0
PyYAML                       6.0.2
referencing                  0.35.1
regex                        2024.9.11
requests                     2.32.3
rich                         13.8.1
rpds-py                      0.20.0
ruff                         0.6.8
safetensors                  0.4.4
schedulefree                 1.2.7
scipy                        1.11.4
semantic-version             2.10.0
sentencepiece                0.2.0
sentry-sdk                   2.14.0
setproctitle                 1.3.3
setuptools                   65.5.0
shellingham                  1.5.4
six                          1.16.0
smmap                        5.0.1
sniffio                      1.3.1
starlette                    0.38.6
sympy                        1.13.1
tensorboard                  2.17.1
tensorboard-data-server      0.7.2
tensorflow                   2.17.0
tensorflow-intel             2.17.0
tensorflow-io-gcs-filesystem 0.31.0
termcolor                    2.4.0
timm                         0.6.12
tk                           0.1.0
tokenizers                   0.19.1
toml                         0.10.2
tomlkit                      0.12.0
toolz                        0.12.1
torch                        2.4.1+cu124
torchaudio                   2.5.0.dev20240930+cu124
torchmetrics                 1.4.2
torchvision                  0.19.1+cu124
tqdm                         4.66.5
transformers                 4.44.2
typer                        0.12.5
typing_extensions            4.12.2
tzdata                       2024.2
urllib3                      2.2.3
uvicorn                      0.31.0
voluptuous                   0.13.1
wandb                        0.18.0
wcwidth                      0.2.13
websockets                   12.0
Werkzeug                     3.0.4
wheel                        0.44.0
wrapt                        1.16.0
xformers                     0.0.28.post1
yarl                         1.13.1
zipp                         3.20.2

[notice] A new release of pip is available: 23.0.1 -> 24.2
[notice] To update, run: python.exe -m pip install --upgrade pip
(venv) PS D:\Kohya_GUI_Flux_Installer_21\kohya_ss> nvidia-smi
Tue Oct  1 05:35:09 2024
| NVIDIA-SMI 560.81                 Driver Version: 560.81         CUDA Version: 12.6     |
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3090      WDDM  |   00000000:02:00.0 Off |                  N/A |
| 73%   63C    P2            204W /  370W |   24271MiB /  24576MiB |    100%      Default |
|                                         |                        |                  N/A |
|   1  NVIDIA GeForce RTX 3060      WDDM  |   00000000:03:00.0  On |                  N/A |
|  0%   59C    P8             23W /  170W |     973MiB /  12288MiB |      3%      Default |
|                                         |                        |                  N/A |
05:11:49-505054 INFO     headless: False
05:11:49-571035 INFO     Using shell=True when running external commands...
Running on local URL:

To create a public link, set `share=True` in `launch()`.
05:12:18-344000 INFO     Loading config...
05:12:23-768624 INFO     Start training Dreambooth...
05:12:23-770609 INFO     Validating lr scheduler arguments...
05:12:23-771608 INFO     Validating optimizer arguments...
05:12:23-772608 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\log existence and writability...
05:12:23-774626 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\model existence and writability...
05:12:23-776608 INFO     Validating
                         s existence... SUCCESS
05:12:23-777627 INFO     Validating
                         existence... SUCCESS
05:12:23-779835 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\img existence... SUCCESS
05:12:23-780608 INFO     Folder 1_ohwx woman: 1 repeats found
05:12:23-783608 INFO     Folder 1_ohwx woman: 220 images found
05:12:23-784607 INFO     Folder 1_ohwx woman: 220 * 1 = 220 steps
05:12:23-786609 INFO     Regularization factor: 1
05:12:23-787608 INFO     Total steps: 220
05:12:23-789609 INFO     Train batch size: 1
05:12:23-790608 INFO     Gradient accumulation steps: 1
05:12:23-791608 INFO     Epoch: 15
05:12:23-792608 INFO     max_train_steps (220 / 1 / 1 * 15 * 1) = 3300
05:12:23-794610 INFO     lr_warmup_steps = 0
05:12:23-799630 WARNING  Here is the trainer command as a reference. It will not be executed:

D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no --dynamo_mode default --gpu_ids 0 --mixed_precision bf16 --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2 D:/Kohya_GUI_Flux_Installer_21/kohya_ss/sd-scripts/ --config_file D:/Kohya_GUI_Flux_Installer_21/train_4\model/config_dreambooth-20241001-051223.toml

05:12:23-801630 INFO     Showing toml config file:

05:12:23-803628 INFO     adaptive_noise_scale = 0
                         ae = "D:/ComfyUI_windows_portable/ComfyUI/models/vae/ae.safetensors"
                         blocks_to_swap = 0
                         bucket_no_upscale = true
                         bucket_reso_steps = 64
                         cache_latents = true
                         cache_latents_to_disk = true
                         cache_text_encoder_outputs = true
                         cache_text_encoder_outputs_to_disk = true
                         caption_dropout_every_n_epochs = 0
                         caption_dropout_rate = 0
                         caption_extension = ".txt"
                         clip_l = "D:/ComfyUI_windows_portable/ComfyUI/models/clip/clip_l.safetensors"
                         cpu_offload_checkpointing = true
                         discrete_flow_shift = 3.1582
                         double_blocks_to_swap = 5
                         dynamo_backend = "no"
                         epoch = 15
                         full_bf16 = true
                         fused_backward_pass = true
                         gradient_accumulation_steps = 1
                         gradient_checkpointing = true
                         guidance_scale = 1
                         huber_c = 0.1
                         huber_schedule = "snr"
                         keep_tokens = 0
                         learning_rate = 4e-6
                         learning_rate_te = 0
                         logging_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\log"
                         loss_type = "l2"
                         lr_scheduler = "constant"
                         lr_scheduler_args = []
                         lr_scheduler_num_cycles = 1
                         lr_scheduler_power = 1
                         lr_warmup_steps = 0
                         max_bucket_reso = 2048
                         max_data_loader_n_workers = 0
                         max_timestep = 1000
                         max_token_length = 75
                         max_train_steps = 3300
                         mem_eff_save = true
                         min_bucket_reso = 256
                         mixed_precision = "bf16"
                         model_prediction_type = "raw"
                         multires_noise_discount = 0.3
                         multires_noise_iterations = 0
                         noise_offset = 0
                         noise_offset_type = "Original"
                         optimizer_args = [ "scale_parameter=False", "relative_step=False", "warmup_init=False",
                         optimizer_type = "Adafactor"
                         output_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\model"
                         output_name = "alr_p"
                         persistent_data_loader_workers = 0
                         pretrained_model_name_or_path =
                         prior_loss_weight = 1
                         resolution = "1024,1024"
                         resume =
                         sample_prompts = "D:/Kohya_GUI_Flux_Installer_21/train_4\\model\\sample/prompt.txt"
                         sample_sampler = "euler_a"
                         save_every_n_epochs = 3
                         save_model_as = "safetensors"
                         save_precision = "fp16"
                         save_state = true
                         save_state_on_train_end = true
                         sdpa = true
                         seed = 1
                         single_blocks_to_swap = 0
                         t5xxl_max_token_length = 512
                         timestep_sampling = "sigmoid"
                         train_batch_size = 1
                         train_blocks = "all"
                         train_data_dir = "D:/Kohya_GUI_Flux_Installer_21/train_4\\img"
                         vae_batch_size = 4
                         wandb_run_name = "alr_p"

05:12:23-814643 INFO     end of toml config file:
05:12:26-356066 INFO     Start training Dreambooth...
05:12:26-358065 INFO     Validating lr scheduler arguments...
05:12:26-360065 INFO     Validating optimizer arguments...
05:12:26-362065 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\log existence and writability...
05:12:26-363064 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\model existence and writability...
05:12:26-366065 INFO     Validating
                         s existence... SUCCESS
05:12:26-368064 INFO     Validating
                         existence... SUCCESS
05:12:26-370064 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\img existence... SUCCESS
05:12:26-372065 INFO     Folder 1_ohwx woman: 1 repeats found
05:12:26-377064 INFO     Folder 1_ohwx woman: 220 images found
05:12:26-379064 INFO     Folder 1_ohwx woman: 220 * 1 = 220 steps
05:12:26-381064 INFO     Regularization factor: 1
05:12:26-382064 INFO     Total steps: 220
05:12:26-384065 INFO     Train batch size: 1
05:12:26-386065 INFO     Gradient accumulation steps: 1
05:12:26-388065 INFO     Epoch: 15
05:12:26-389064 INFO     max_train_steps (220 / 1 / 1 * 15 * 1) = 3300
05:12:26-392065 INFO     lr_warmup_steps = 0
05:12:26-399065 INFO     Saving training config to
05:12:26-404065 INFO     Executing command: D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\Scripts\accelerate.EXE launch
                         --dynamo_backend no --dynamo_mode default --gpu_ids 0 --mixed_precision bf16 --num_processes 1
                         --num_machines 1 --num_cpu_threads_per_process 2
                         D:/Kohya_GUI_Flux_Installer_21/kohya_ss/sd-scripts/ --config_file
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\diffusers\utils\ FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\diffusers\utils\ FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
2024-10-01 05:12:51 INFO     Loading settings from D:/Kohya_GUI_Flux_Installer_21/train_4\model/config_dreambooth-20241001-051226.toml...                                      
                    INFO     D:/Kohya_GUI_Flux_Installer_21/train_4\model/config_dreambooth-20241001-051226                                                                    
2024-10-01 05:12:51 INFO     Using DreamBooth method.                                                                                                                                 
                    INFO     prepare images.                                                                                                                                         
                    INFO     get image size from name of cache files                                                                                                                 
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 399.19it/s]
2024-10-01 05:12:52 INFO     set image size from cache files: 220/220                                                                                                                
                    INFO     found directory D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman contains 220 image files                                                  
                    WARNING  No caption file found for 220 images. Training will continue without captions for these images. If class token exists, it will be used. /               
                             220枚の画像にキャプションファイルが見つかりませんでした。これらの画像についてはキャプションなしで学習を続行します。class tokenが存在する場合はそれを使います。
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1413--flux1-dev-952361136.jpg                                                             
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1610--xl_stoiq_duchaiten_00001_-298273446.jpg                                             
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092064-1.jpg                                          
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092064.jpg                                            
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092065-1.jpg                                          
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1641--xl_stoiq_duchaiten_00001_-1106092067-1.jpg... and 215 more                          
                    INFO     220 train images with repeating.                                                                                                                        
                    INFO     0 reg images.                                                                                                                                           
                    WARNING  no regularization images / 正則化画像が見つかりませんでした                                                                                             
                    INFO     [Dataset 0]                                                                                                                                             
                               batch_size: 1
                               resolution: (1024, 1024)
                               enable_bucket: False
                               network_multiplier: 1.0

                               [Subset 0 of Dataset 0]
                                 image_dir: "D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman"
                                 image_count: 220
                                 num_repeats: 1
                                 shuffle_caption: False
                                 keep_tokens: 0
                                 caption_separator: ,
                                 secondary_separator: None
                                 enable_wildcard: False
                                 caption_dropout_rate: 0
                                 caption_dropout_every_n_epoches: 0
                                 caption_tag_dropout_rate: 0.0
                                 caption_prefix: None
                                 caption_suffix: None
                                 color_aug: False
                                 flip_aug: False
                                 face_crop_aug_range: None
                                 random_crop: False
                                 token_warmup_min: 1,
                                 token_warmup_step: 0,
                                 alpha_mask: False,
                                 is_reg: False
                                 class_tokens: ohwx woman
                                 caption_extension: .txt

                    INFO     [Dataset 0]                                                                                                                                             
                    INFO     loading image sizes.                                                                                                                                     
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<?, ?it/s]
                    INFO     prepare dataset                                                                                                                                          
                    INFO     prepare accelerator                                                                                                                                      
accelerator device: cuda
                    INFO     Building AutoEncoder                                                                                                                                      
                    INFO     Loading state dict from D:/ComfyUI_windows_portable/ComfyUI/models/vae/ae.safetensors                                                                     
                    INFO     Loaded AE: <All keys matched successfully>                                                                                                                
2024-10-01 05:12:54 INFO     [Dataset 0]                                                                                                                                             
                    INFO     caching latents with caching strategy.                                                                                                                  
                    INFO     checking cache validity...                                                                                                                              
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 569.28it/s]
                    INFO     no latents to cache                                                                                                                                     
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\transformers\ FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue:
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in
2024-10-01 05:12:56 INFO     Building CLIP                                                                                                                                             
                    INFO     Loading state dict from D:/ComfyUI_windows_portable/ComfyUI/models/clip/clip_l.safetensors                                                               
                    INFO     Loaded CLIP: <All keys matched successfully>                                                                                                             
                    INFO     Loading state dict from None                                                                                                                             
Traceback (most recent call last):
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\sd-scripts\", line 993, in <module>
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\sd-scripts\", line 211, in train
    t5xxl = flux_utils.load_t5xxl(args.t5xxl, weight_dtype, "cpu", args.disable_mmap_load_safetensors)
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\sd-scripts\library\", line 216, in load_t5xxl
    sd = load_safetensors(ckpt_path, device=str(device), disable_mmap=disable_mmap, dtype=dtype)
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\sd-scripts\library\", line 39, in load_safetensors
    return load_file(path)  # prevent device invalid Error
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\safetensors\", line 313, in load_file
    with safe_open(filename, framework="pt", device=device) as f:
TypeError: argument 'filename': expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "C:\Python310\lib\", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\", line 86, in _run_code
    exec(code, run_globals)
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\Scripts\accelerate.EXE\", line 7, in <module>
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\commands\", line 48, in main
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\commands\", line 1106, in launch_command
  File "D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\commands\", line 704, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\\Kohya_GUI_Flux_Installer_21\\kohya_ss\\venv\\Scripts\\python.exe', 'D:/Kohya_GUI_Flux_Installer_21/kohya_ss/sd-scripts/', '--config_file', 'D:/Kohya_GUI_Flux_Installer_21/train_4\\model/config_dreambooth-20241001-051226.toml']' returned non-zero exit status 1.
05:12:59-047872 INFO     Training has ended.
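
Note on the failed attempt above: the traceback ends in `load_t5xxl(args.t5xxl, ...)` receiving `None` (the log line just before it reads `Loading state dict from None`), and the TOML dump for that attempt shows no `t5xxl` entry. Below is a minimal sketch of a pre-launch check, assuming the TOML has already been parsed into a dict (for example with `toml.load`). `missing_paths` and `PATH_KEYS` are hypothetical names, not part of kohya_ss or sd-scripts, and whether the trainer treats all four keys as required is an assumption.

```python
from typing import Iterable, List, Mapping

# Keys that the config in this post sets to model file paths.
# Treating all of them as required is an assumption, not trainer behavior.
PATH_KEYS = ("pretrained_model_name_or_path", "ae", "clip_l", "t5xxl")


def missing_paths(config: Mapping[str, object], keys: Iterable[str] = PATH_KEYS) -> List[str]:
    """Return the keys that are absent, None, or a blank string in config."""
    missing = []
    for key in keys:
        value = config.get(key)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(key)
    return missing


# Hypothetical config resembling the failed attempt: blank model path, no t5xxl entry.
example = {
    "ae": "D:/ComfyUI_windows_portable/ComfyUI/models/vae/ae.safetensors",
    "clip_l": "D:/ComfyUI_windows_portable/ComfyUI/models/clip/clip_l.safetensors",
    "pretrained_model_name_or_path": "",
}
print(missing_paths(example))  # prints ['pretrained_model_name_or_path', 't5xxl']
```

Running a check like this before launching would surface the missing path up front instead of as a `TypeError` inside `safe_open`.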
05:14:10-841036 INFO     Save...
05:14:15-063648 INFO     Start training Dreambooth...
05:14:15-065647 INFO     Validating lr scheduler arguments...
05:14:15-066648 INFO     Validating optimizer arguments...
05:14:15-067644 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\log existence and writability... SUCCESS
05:14:15-070646 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\model existence and writability... SUCCESS
05:14:15-072647 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005.safetensors existence... SUCCESS
05:14:15-073647 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005-state existence... SUCCESS
05:14:15-075647 INFO     Validating D:/Kohya_GUI_Flux_Installer_21/train_4\img existence... SUCCESS
05:14:15-077644 INFO     Folder 1_ohwx woman: 1 repeats found
05:14:15-079647 INFO     Folder 1_ohwx woman: 220 images found
05:14:15-081647 INFO     Folder 1_ohwx woman: 220 * 1 = 220 steps
05:14:15-082647 INFO     Regularization factor: 1
05:14:15-083647 INFO     Total steps: 220
05:14:15-084646 INFO     Train batch size: 1
05:14:15-086645 INFO     Gradient accumulation steps: 1
05:14:15-087647 INFO     Epoch: 15
05:14:15-088647 INFO     max_train_steps (220 / 1 / 1 * 15 * 1) = 3300
05:14:15-090647 INFO     lr_warmup_steps = 0
05:14:15-095659 INFO     Saving training config to D:/Kohya_GUI_Flux_Installer_21/train_4\model\alr_p_20241001-051415.json...
05:14:15-098647 INFO     Executing command: D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\Scripts\accelerate.EXE launch --dynamo_backend no --dynamo_mode default --gpu_ids 0 --mixed_precision bf16
                         --num_processes 1 --num_machines 1 --num_cpu_threads_per_process 2 D:/Kohya_GUI_Flux_Installer_21/kohya_ss/sd-scripts/ --config_file
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\diffusers\utils\ FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\diffusers\utils\ FutureWarning: `torch.utils._pytree._register_pytree_node` is deprecated. Please use `torch.utils._pytree.register_pytree_node` instead.
2024-10-01 05:14:39 INFO     Loading settings from D:/Kohya_GUI_Flux_Installer_21/train_4\model/config_dreambooth-20241001-051415.toml...                                      
                    INFO     D:/Kohya_GUI_Flux_Installer_21/train_4\model/config_dreambooth-20241001-051415                                                                    
2024-10-01 05:14:39 INFO     Using DreamBooth method.                                                                                                                                 
                    INFO     prepare images.                                                                                                                                         
                    INFO     get image size from name of cache files                                                                                                                 
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 405.83it/s]
                    INFO     set image size from cache files: 220/220                                                                                                                
                    INFO     found directory D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman contains 220 image files                                                  
                    WARNING  No caption file found for 220 images. Training will continue without captions for these images. If class token exists, it will be used. /               
                             220枚の画像にキャプションファイルが見つかりませんでした。これらの画像についてはキャプションなしで学習を続行します。class tokenが存在する場合はそれを使います。
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1413--flux1-dev-952361136.jpg                                                             
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1610--xl_stoiq_duchaiten_00001_-298273446.jpg                                             
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092064-1.jpg                                          
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092064.jpg                                            
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1639--xl_stoiq_duchaiten_00001_-1106092065-1.jpg                                          
                    WARNING  D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman\1641--xl_stoiq_duchaiten_00001_-1106092067-1.jpg... and 215 more                          
                    INFO     220 train images with repeating.                                                                                                                        
                    INFO     0 reg images.                                                                                                                                           
                    WARNING  no regularization images / 正則化画像が見つかりませんでした                                                                                             
                    INFO     [Dataset 0]                                                                                                                                             
                               batch_size: 1
                               resolution: (1024, 1024)
                               enable_bucket: False
                               network_multiplier: 1.0

                               [Subset 0 of Dataset 0]
                                 image_dir: "D:\Kohya_GUI_Flux_Installer_21\train_4\img\1_ohwx woman"
                                 image_count: 220
                                 num_repeats: 1
                                 shuffle_caption: False
                                 keep_tokens: 0
                                 caption_separator: ,
                                 secondary_separator: None
                                 enable_wildcard: False
                                 caption_dropout_rate: 0
                                 caption_dropout_every_n_epoches: 0
                                 caption_tag_dropout_rate: 0.0
                                 caption_prefix: None
                                 caption_suffix: None
                                 color_aug: False
                                 flip_aug: False
                                 face_crop_aug_range: None
                                 random_crop: False
                                 token_warmup_min: 1,
                                 token_warmup_step: 0,
                                 alpha_mask: False,
                                 is_reg: False
                                 class_tokens: ohwx woman
                                 caption_extension: .txt

                    INFO     [Dataset 0]                                                                                                                                             
                    INFO     loading image sizes.                                                                                                                                     
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 220383.78it/s]
                    INFO     prepare dataset                                                                                                                                          
                    INFO     prepare accelerator                                                                                                                                      
accelerator device: cuda
                    INFO     Building AutoEncoder                                                                                                                                      
2024-10-01 05:14:40 INFO     Loading state dict from D:/ComfyUI_windows_portable/ComfyUI/models/vae/ae.safetensors                                                                     
                    INFO     Loaded AE: <All keys matched successfully>                                                                                                                
                    INFO     [Dataset 0]                                                                                                                                             
                    INFO     caching latents with caching strategy.                                                                                                                  
                    INFO     checking cache validity...                                                                                                                              
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 2050.71it/s]
                    INFO     no latents to cache                                                                                                                                     
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\transformers\ FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be depracted in transformers v4.45, and will be then set to `False` by default. For more details check this issue:
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in
2024-10-01 05:14:41 INFO     Building CLIP                                                                                                                                             
                    INFO     Loading state dict from D:/ComfyUI_windows_portable/ComfyUI/models/clip/clip_l.safetensors                                                               
                    INFO     Loaded CLIP: <All keys matched successfully>                                                                                                             
                    INFO     Loading state dict from D:/ComfyUI_windows_portable/ComfyUI/models/clip/t5xxl_fp16.safetensors                                                           
2024-10-01 05:14:42 INFO     Loaded T5xxl: <All keys matched successfully>                                                                                                            
2024-10-01 05:15:10 INFO     [Dataset 0]                                                                                                                                             
                    INFO     caching Text Encoder outputs with caching strategy.                                                                                                     
                    INFO     checking cache validity...                                                                                                                              
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 220/220 [00:00<00:00, 563.95it/s]
2024-10-01 05:15:11 INFO     no Text Encoder outputs to cache                                                                                                                        
                    INFO     cache Text Encoder outputs for sample prompt: D:/Kohya_GUI_Flux_Installer_21/train_4\model\sample/prompt.txt                                       
                    INFO     Building Flux model dev                                                                                                                                   
                    INFO     Loading state dict from D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005.safetensors                                  
                    INFO     Loaded Flux: <All keys matched successfully>                                                                                                              
FLUX: Gradient checkpointing enabled. CPU offload: True
number of trainable parameters: 11901408320
prepare optimizer, data loader etc.
                    INFO     use Adafactor optimizer | {'scale_parameter': False, 'relative_step': False, 'warmup_init': False, 'weight_decay': 0.01}                                
                    WARNING  because max_grad_norm is set, clip_grad_norm is enabled. consider set to 0 /                                                                            
                    WARNING  constant_with_warmup will be good / スケジューラはconstant_with_warmupが良いかもしれません                                                              
enable full bf16 training.
2024-10-01 05:17:43 INFO     resume training from local state: D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005-state                            
                    INFO     Loading states from D:/Kohya_GUI_Flux_Installer_21/train_3/model/Alternate_reality_0-000005-state                                         
2024-10-01 05:18:55 INFO     All model weights loaded successfully                                                                                                                 
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\ FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  optimizer_state = torch.load(input_optimizer_file, map_location=map_location)
                    INFO     All optimizer states loaded successfully                                                                                                              
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\ FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
                    INFO     All scheduler states loaded successfully                                                                                                              
                    INFO     All dataloader sampler states loaded successfully                                                                                                     
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\accelerate\ FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  states = torch.load(input_dir.joinpath(f"{RNG_STATE_NAME}_{process_index}.pkl"))
                    INFO     All random states loaded successfully                                                                                                                 
                    INFO     Loading in 0 custom states                                                                                                                             
running training / 学習開始
  num examples / サンプル数: 220
  num batches per epoch / 1epochのバッチ数: 220
  num epochs / epoch数: 15
  batch size per device / バッチサイズ: 1
  gradient accumulation steps / 勾配を合計するステップ数 = 1
  total optimization steps / 学習ステップ数: 3300
steps:   0%|                                                                                                                                                                           | 0/3300 [00:00<?, ?it/s]
epoch 1/15
2024-10-01 05:18:56 INFO     epoch is incremented. current_epoch: 0, epoch: 1                                                                                                         
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\sd-scripts\library\ UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  x = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
D:\Kohya_GUI_Flux_Installer_21\kohya_ss\venv\lib\site-packages\torch\utils\ FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context:  # type: ignore[attr-defined]
steps:   0%|▍                                                                                                                                             | 10/3300 [21:25<117:29:44, 128.57s/it, avr_loss=0.47]
veighnsche changed the title from "After upgrading and rerunning the setup, Flux finetuning from14s/it turned into 164s/it" to "After upgrading and rerunning the setup, Flux finetuning from 14s/it into 164s/it" on Oct 1, 2024

I rolled back to commit 95bf7ff, and that seems to work.
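
For a sense of scale, here is a rough sketch of what the slowdown means for the full run. It only uses numbers already in this issue (3300 total optimization steps, about 14s/it before the update, 128.57s/it in the progress bar of the log). The helper name `eta_hours` is made up for illustration, and the estimate assumes the per-step time stays constant for the whole run.

```python
def eta_hours(steps: int, sec_per_it: float) -> float:
    """Estimated wall-clock hours for `steps` iterations at a constant rate."""
    return steps * sec_per_it / 3600


TOTAL_STEPS = 3300       # "total optimization steps" from the log
RATE_BEFORE = 14.0       # s/it reported before git pull and setup rerun
RATE_AFTER = 128.57      # s/it shown in the progress bar after the update

before = eta_hours(TOTAL_STEPS, RATE_BEFORE)
after = eta_hours(TOTAL_STEPS, RATE_AFTER)
slowdown = RATE_AFTER / RATE_BEFORE

print(f"before: {before:.1f} h, after: {after:.1f} h, slowdown: {slowdown:.1f}x")
# prints "before: 12.8 h, after: 117.9 h, slowdown: 9.2x"
```

So the same 15-epoch run goes from roughly half a day to nearly five days, which lines up with the remaining-time estimate of about 117 hours in the progress bar.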
