Finetuning with different max_seq_len and patch_size #182

Open
bkoyuncu opened this issue Feb 6, 2025 · 1 comment

bkoyuncu commented Feb 6, 2025

Discussed in #181

Originally posted by bkoyuncu February 6, 2025
Hi, I want to load a pretrained model but with a different max_seq_len and different patch_sizes. I have realized it is not enough to change the arguments in, for example, moirai_1.0_R_small.yaml, since the config is regenerated from the Hugging Face Hub checkpoint.

Is there a better approach?

```yaml
# load a pretrained checkpoint from huggingface hub
_target_: uni2ts.model.moirai.MoiraiFinetune
module:
  _target_: uni2ts.model.moirai.MoiraiModule.from_pretrained
  pretrained_model_name_or_path: Salesforce/moirai-1.0-R-small
module_kwargs:
  _target_: builtins.dict
  distr_output:
    _target_: uni2ts.distribution.MixtureOutput
    components:
      - _target_: uni2ts.distribution.StudentTOutput
      - _target_: uni2ts.distribution.NormalFixedScaleOutput
      - _target_: uni2ts.distribution.NegativeBinomialOutput
      - _target_: uni2ts.distribution.LogNormalOutput
  d_model: 384
  num_layers: 6
  patch_sizes: ${as_tuple:[8, 16, 32, 64, 128]} # <--------- Change does not affect the model
  max_seq_len: 512 # <--------- Change does not affect the model
  attn_dropout_p: 0.0
  dropout_p: 0.0
  scaling: true
min_patches: 2
min_mask_ratio: 0.15
max_mask_ratio: 0.5
max_dim: 128
loss_func:
  _target_: uni2ts.loss.packed.PackedNLLLoss
val_metric:
  - _target_: uni2ts.loss.packed.PackedMSELoss
  - _target_: uni2ts.loss.packed.PackedNRMSELoss
    normalize: absolute_target_squared
lr: 1e-3
weight_decay: 1e-1
beta1: 0.9
beta2: 0.98
num_training_steps: ${mul:${trainer.max_epochs},${train_dataloader.num_batches_per_epoch}}
num_warmup_steps: 0
```
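
To illustrate that the overrides above are ignored, the loaded module can be inspected directly. A minimal sketch, assuming MoiraiModule exposes patch_sizes and max_seq_len as attributes:

```python
from uni2ts.model.moirai import MoiraiModule

# The values below come from the checkpoint's hub config,
# not from the YAML overrides above.
module = MoiraiModule.from_pretrained("Salesforce/moirai-1.0-R-small")
print(module.patch_sizes)  # pretrained patch sizes
print(module.max_seq_len)  # pretrained max_seq_len
```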
chenghaoliu89 (Contributor) commented

Hi @bkoyuncu, you are right: the patch sizes and maximum sequence length are currently taken directly from the model config. You can modify default_train_transform in finetune.py:

1. Change the patch sizes:

   ```python
   GetPatchSize(
       min_time_patches=self.hparams.min_patches,
       target_field="target",
       patch_sizes=self.module.patch_sizes,  # --> change to your expected patch sizes
       patch_size_constraints=DefaultPatchSizeConstraints(),
       offset=True,
   )
   ```

2. Change the max sequence length:

   ```python
   PatchCrop(
       min_time_patches=self.hparams.min_patches,
       max_patches=self.module.max_seq_len,  # --> change to your expected max_seq_len
       will_flatten=True,
       offset=True,
       fields=("target",),
       optional_fields=("past_feat_dynamic_real",),
   )
   ```
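
Putting the two edits together, here is a minimal standalone sketch of the modified transform calls. It assumes GetPatchSize, PatchCrop, and DefaultPatchSizeConstraints can be imported from uni2ts.transform, and the literal values NEW_PATCH_SIZES, NEW_MAX_SEQ_LEN, and MIN_PATCHES are placeholders standing in for self.module.patch_sizes, self.module.max_seq_len, and self.hparams.min_patches:

```python
# Sketch only: the import path and the literal values are assumptions for illustration.
from uni2ts.transform import DefaultPatchSizeConstraints, GetPatchSize, PatchCrop

NEW_PATCH_SIZES = (32, 64)  # replaces self.module.patch_sizes
NEW_MAX_SEQ_LEN = 1024      # replaces self.module.max_seq_len
MIN_PATCHES = 2             # stands in for self.hparams.min_patches

# Patch-size sampling now draws from the custom patch sizes.
patch_size_transform = GetPatchSize(
    min_time_patches=MIN_PATCHES,
    target_field="target",
    patch_sizes=NEW_PATCH_SIZES,
    patch_size_constraints=DefaultPatchSizeConstraints(),
    offset=True,
)

# Cropping now allows up to the custom maximum number of patches per sequence.
patch_crop_transform = PatchCrop(
    min_time_patches=MIN_PATCHES,
    max_patches=NEW_MAX_SEQ_LEN,
    will_flatten=True,
    offset=True,
    fields=("target",),
    optional_fields=("past_feat_dynamic_real",),
)
```

Presumably the new patch sizes should remain a subset of the pretrained patch_sizes, since the module's per-patch-size projection weights are tied to them.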
