
Trying to finetune #195

Open
AlexCohenDambros opened this issue Mar 3, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@AlexCohenDambros

I am also trying to do fine-tuning and encountered the same issues as mentioned in #174. I followed the steps outlined, replicating the code presented in #174, with the only change being the location where the data will be saved:

hf_train_ds.save_to_disk("test_database/store_ds")
hf_val_ds.save_to_disk("test_database/store_ds_eval")

I followed the same approach and created two YAML configuration files for training and validation. They are located respectively at cli/conf/finetune/data/data_fine_tuning.yaml and cli/conf/finetune/val_data/val_fine_tuning.yaml.

The training YAML file is:

_target_: uni2ts.data.builder.simple.SimpleDatasetBuilder
dataset: store_ds
storage_path: test_database
weight: 1000

And the validation YAML file is:

_target_: uni2ts.data.builder.ConcatDatasetBuilder
_args_:
  _target_: uni2ts.data.builder.simple.generate_eval_builders
  dataset: store_ds_eval
  storage_path: test_database
  offset: 95
  eval_length: 8
  prediction_lengths: [8]
  context_lengths: [16]
  patch_sizes: [16]

Then, I ran the training command as specified, changing only the file name:

python -m cli.train -cp conf/finetune run_name=fine_tuning_morai model=moirai_1.0_R_small data=data_fine_tuning val_data=val_fine_tuning

However, it generates the following error:

AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
...
assert time >= b > a >= 0
AssertionError
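For reference, the failing assertion is EvalCrop's bound check on the crop window. The sketch below is a hypothetical reconstruction from the error message alone (the formulas for `a` and `b` are assumptions, not the actual uni2ts source); it illustrates how an `offset` too close to the end of a short series trips the assertion:

```python
def crop_is_valid(time: int, offset: int, context_length: int,
                  prediction_length: int) -> bool:
    """Hypothetical reconstruction of EvalCrop's bound check.

    `time` is the series length; the forecast window is assumed to start
    at `offset`, the crop spans [a, b), and `time >= b > a >= 0` requires
    the whole context + prediction window to fit inside the series.
    """
    fcst_start = offset
    a = fcst_start - context_length      # start of the context window
    b = fcst_start + prediction_length   # end of the prediction window
    return time >= b > a >= 0

# With the validation config above (offset=95, context_lengths=[16],
# prediction_lengths=[8]), any series shorter than 95 + 8 = 103 points
# cannot hold the window:
print(crop_is_valid(time=200, offset=95, context_length=16, prediction_length=8))  # True
print(crop_is_valid(time=100, offset=95, context_length=16, prediction_length=8))  # False
```

If your saved series are shorter than `offset + prediction_length`, reducing `offset` (or the window lengths) is the first thing to try.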

My questions:

  1. Is it possible to perform fine-tuning of the model without passing validation data?
  2. After performing the fine-tuning of the model, how do I load the model?

Thank you in advance for your help!

@AlexCohenDambros AlexCohenDambros added the bug Something isn't working label Mar 3, 2025
@zqiao11
Contributor

zqiao11 commented Mar 5, 2025

Hi. This error is caused by EvalCrop, indicating that the context/prediction window cannot be cropped under the current configuration as it would exceed the data bounds. You can check the values of fcst_start, a and b to identify the problem.

Additionally, validation data is optional; simply omit val_data when running python -m cli.train.

After fine-tuning, you can load the model with moirai_lightning_ckpt. Set checkpoint_path to your fine-tuned checkpoint in the output directory. You can refer to the script in PR #189.
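PyTorch Lightning normally writes checkpoints as `*.ckpt` files somewhere under the run's output directory. A small helper like the one below (an illustrative sketch, not part of uni2ts) can locate the newest one to pass as `checkpoint_path`:

```python
from pathlib import Path
from typing import Optional

def latest_checkpoint(run_dir: str) -> Optional[Path]:
    # Recursively search the run's output directory for Lightning
    # checkpoint files (*.ckpt) and return the most recently modified,
    # or None if training produced no checkpoint at all.
    ckpts = sorted(Path(run_dir).rglob("*.ckpt"),
                   key=lambda p: p.stat().st_mtime)
    return ckpts[-1] if ckpts else None
```

If this returns None, the trainer saved no checkpoint at all, which would also explain an outputs folder containing only `.hydra`, logs, and `train.log`.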

@AlexCohenDambros
Author

Hi @zqiao11, firstly, thank you for your response.

I made the adjustment without providing validation data, using the following command:

CUDA_VISIBLE_DEVICES=0 python -m cli.train -cp conf/finetune run_name=fine_tuning_morai model=moirai_1.0_R_small data=data_fine_tuning

The training was executed with max_epochs set to 5.

As a result, I obtained an outputs folder, but it does not contain any checkpoint for the fine-tuned model. The directory in question is:

outputs/finetune/moirai_1.0_R_small/data_fine_tuning/fine_tuning_morai

Inside, it contains only a .hydra folder, a logs folder, and the train.log file.

To load the model, I tried the following, but it didn't work:

from uni2ts.model.moirai import MoiraiForecast
model = MoiraiForecast.load_from_checkpoint(
    checkpoint_path="outputs/finetune/moirai_1.1_R_small/data_fine_tuning/fine_tuning_morai",
    num_samples=100,
    patch_size=16,
    context_length=398
)

Finally, I also tried testing the code in PR #189 by executing the following command:

CUDA_VISIBLE_DEVICES=0 python -m cli.train -cp conf/finetune \
    exp_name=example_lsf \
    run_name=example_run \
    model=moirai_1.0_R_small \
    model.patch_size=32 \
    model.context_length=1000 \
    model.prediction_length=96 \
    data=data_fine_tuning \
    data.patch_size=32 \
    data.context_length=1000 \
    data.prediction_length=96 \
    data.mode=S

I also removed the validation data and replaced the etth1 data with my own, but this resulted in the following error:

Error executing job with overrides: ['exp_name=example_lsf', 'run_name=example_run', 'model=moirai_1.0_R_small', 'model.patch_size=32', 'model.context_length=1000', 'model.prediction_length=96', 'data=data_fine_tuning', 'data.patch_size=32', 'data.context_length=1000', 'data.prediction_length=96', 'data.mode=S']
Error in call to target 'uni2ts.model.moirai.finetune.MoiraiFinetune':
TypeError("MoiraiFinetune.__init__() got an unexpected keyword argument 'patch_size'")
full_key: model

It is recommended to set the environment variable `HYDRA_FULL_ERROR=1` for a complete stack trace.
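The TypeError means the constructor of the `model=...` target class does not accept a `patch_size` keyword, so Hydra's instantiation fails on that override. A generic stdlib sketch (independent of uni2ts) for checking which overrides a target class will actually accept before Hydra instantiates it:

```python
import inspect

def accepted_kwargs(cls) -> set:
    # Return the parameter names cls.__init__ accepts (besides self);
    # any model.* override outside this set triggers a TypeError like
    # the one above.
    sig = inspect.signature(cls.__init__)
    return {name for name in sig.parameters if name != "self"}

# Illustrative stand-in class (not the real MoiraiFinetune):
class Demo:
    def __init__(self, lr: float, max_epochs: int = 5):
        self.lr, self.max_epochs = lr, max_epochs

print(sorted(accepted_kwargs(Demo)))  # ['lr', 'max_epochs']
```

Running this against the actual `_target_` class from the config would show whether `patch_size` belongs on the model config at all or only on the data config.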
