Finetuning with different max_seq_len and patch_size #182

Open
bkoyuncu opened this issue Feb 6, 2025 · 1 comment

bkoyuncu commented Feb 6, 2025

Discussed in #181

Originally posted by bkoyuncu February 6, 2025
Hi, I want to load a pretrained model but with a different max_seq_len and different patch_sizes. I have realized it is not enough to change the arguments in, for example, moirai_1.0_R_small.yaml, since the config is regenerated from the Hugging Face Hub checkpoint.

Is there a better approach?

```yaml
# load a pretrained checkpoint from huggingface hub
_target_: uni2ts.model.moirai.MoiraiFinetune
module:
  _target_: uni2ts.model.moirai.MoiraiModule.from_pretrained
  pretrained_model_name_or_path: Salesforce/moirai-1.0-R-small
module_kwargs:
  _target_: builtins.dict
  distr_output:
    _target_: uni2ts.distribution.MixtureOutput
    components:
      - _target_: uni2ts.distribution.StudentTOutput
      - _target_: uni2ts.distribution.NormalFixedScaleOutput
      - _target_: uni2ts.distribution.NegativeBinomialOutput
      - _target_: uni2ts.distribution.LogNormalOutput
  d_model: 384
  num_layers: 6
  patch_sizes: ${as_tuple:[8, 16, 32, 64, 128]} # <--------- Change does not affect the model
  max_seq_len: 512 # <--------- Change does not affect the model
  attn_dropout_p: 0.0
  dropout_p: 0.0
  scaling: true
min_patches: 2
min_mask_ratio: 0.15
max_mask_ratio: 0.5
max_dim: 128
loss_func:
  _target_: uni2ts.loss.packed.PackedNLLLoss
val_metric:
  - _target_: uni2ts.loss.packed.PackedMSELoss
  - _target_: uni2ts.loss.packed.PackedNRMSELoss
    normalize: absolute_target_squared
lr: 1e-3
weight_decay: 1e-1
beta1: 0.9
beta2: 0.98
num_training_steps: ${mul:${trainer.max_epochs},${train_dataloader.num_batches_per_epoch}}
num_warmup_steps: 0
```
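
To illustrate that the overrides above are ignored, the loaded module can be inspected directly. A minimal sketch, assuming MoiraiModule exposes patch_sizes and max_seq_len as attributes:

```python
from uni2ts.model.moirai import MoiraiModule

# The values below come from the checkpoint's hub config,
# not from the YAML overrides above.
module = MoiraiModule.from_pretrained("Salesforce/moirai-1.0-R-small")
print(module.patch_sizes)  # pretrained patch sizes
print(module.max_seq_len)  # pretrained max_seq_len
```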
chenghaoliu89 (Contributor) commented

Hi @bkoyuncu, you are right: the patch sizes and maximum sequence length are currently taken directly from the model config. You can modify default_train_transform in finetune.py:

1. Change the patch sizes:

   ```python
   GetPatchSize(
       min_time_patches=self.hparams.min_patches,
       target_field="target",
       patch_sizes=self.module.patch_sizes,  # --> change to your expected patch sizes
       patch_size_constraints=DefaultPatchSizeConstraints(),
       offset=True,
   )
   ```

2. Change the max sequence length:

   ```python
   PatchCrop(
       min_time_patches=self.hparams.min_patches,
       max_patches=self.module.max_seq_len,  # --> change to your expected max_seq_len
       will_flatten=True,
       offset=True,
       fields=("target",),
       optional_fields=("past_feat_dynamic_real",),
   )
   ```
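
Putting the two edits together, here is a minimal standalone sketch of the modified transform calls. It assumes GetPatchSize, PatchCrop, and DefaultPatchSizeConstraints can be imported from uni2ts.transform, and the literal values NEW_PATCH_SIZES, NEW_MAX_SEQ_LEN, and MIN_PATCHES are placeholders standing in for self.module.patch_sizes, self.module.max_seq_len, and self.hparams.min_patches:

```python
# Sketch only: the import path and the literal values are assumptions for illustration.
from uni2ts.transform import DefaultPatchSizeConstraints, GetPatchSize, PatchCrop

NEW_PATCH_SIZES = (32, 64)  # replaces self.module.patch_sizes
NEW_MAX_SEQ_LEN = 1024      # replaces self.module.max_seq_len
MIN_PATCHES = 2             # stands in for self.hparams.min_patches

# Patch-size sampling now draws from the custom patch sizes.
patch_size_transform = GetPatchSize(
    min_time_patches=MIN_PATCHES,
    target_field="target",
    patch_sizes=NEW_PATCH_SIZES,
    patch_size_constraints=DefaultPatchSizeConstraints(),
    offset=True,
)

# Cropping now allows up to the custom maximum number of patches per sequence.
patch_crop_transform = PatchCrop(
    min_time_patches=MIN_PATCHES,
    max_patches=NEW_MAX_SEQ_LEN,
    will_flatten=True,
    offset=True,
    fields=("target",),
    optional_fields=("past_feat_dynamic_real",),
)
```

Presumably the new patch sizes should remain a subset of the pretrained patch_sizes, since the module's per-patch-size projection weights are tied to them.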
