I would like to kindly ask that the training hyperparameters for the semantic segmentation EfficientViT models be released.
The paper specifies only that the AdamW optimizer and a cosine learning rate decay were used, and I am having trouble finding the right settings to train the models well.
For example, it would be nice to know:
- Batch size
- Number of training iterations
- Initial learning rate
- Whether the learning rate was set differently for the backbone vs. the SegHead
- Settings for the cosine LR scheduler (e.g., was a warmup used?)
- Whether weight decay was used
- Augmentations used
- Etc.
It would also be good to know whether any of the hyperparameters differed between training on ADE20K and training on Cityscapes.
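For context, here is a minimal sketch of the kind of recipe I am currently experimenting with, assuming only what the paper states (AdamW plus cosine decay). Every concrete value here, including the learning rates, warmup/total iterations, weight decay, and the backbone/head LR split, is my own guess, not the authors' setting:

```python
import math

from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

# Hypothetical stand-ins for the real backbone and SegHead; only the
# parameter-group split matters for this sketch.
backbone = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
seg_head = nn.Conv2d(32, 150, 1)  # e.g., 150 classes for ADE20K

# Guessed values -- NOT the authors' settings.
base_lr = 6e-5
weight_decay = 0.01
warmup_iters = 1_500
total_iters = 160_000

optimizer = AdamW(
    [
        # Guess: a smaller LR for the pretrained backbone than for the head.
        {"params": backbone.parameters(), "lr": base_lr * 0.1},
        {"params": seg_head.parameters(), "lr": base_lr},
    ],
    weight_decay=weight_decay,
)

def warmup_cosine(it: int) -> float:
    """Linear warmup followed by cosine decay to zero (a common recipe)."""
    if it < warmup_iters:
        return it / max(1, warmup_iters)
    progress = (it - warmup_iters) / max(1, total_iters - warmup_iters)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda=warmup_cosine)
# The training loop would call scheduler.step() once per iteration.
```

If the actual recipe differs from this shape (e.g., no warmup, a single LR for all parameters, or a polynomial rather than cosine schedule), knowing that would already be very helpful.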
Thank you for sharing this amazing work; it is very cool that such an elegant method produces such powerful results.