If the model converges, what range should the total loss fall into? Did anyone find a set of hyperparameters adapted to a small batch size? #479
Comments
I've encountered a similar issue as well. Do you have any suggestions?
I also trained on my own dataset and found that the loss decreased rapidly, then became difficult to decrease further, but the final model performance was still good. I suggest using freeze fine-tuning or a kNN evaluation (see the sketch below) to check whether your model has converged. So even though the loss stops decreasing, continued training should still be moving in a positive direction.
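For reference, a minimal sketch of such a kNN sanity check on frozen features, assuming a small labeled probe set and that `model` is the trained backbone returning a `[B, D]` embedding per image (the loader and variable names here are illustrative, not the repo's evaluation script):

```python
import torch
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

@torch.no_grad()
def extract_features(model, loader, device="cuda"):
    # Run the frozen backbone over a labeled loader and collect (features, labels).
    model.eval()
    feats, labels = [], []
    for images, targets in loader:
        out = model(images.to(device))  # assumes the backbone returns a [B, D] embedding
        feats.append(out.cpu())
        labels.append(targets)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

# train_loader / val_loader are assumed labeled DataLoaders over a small probe dataset.
train_x, train_y = extract_features(model, train_loader)
val_x, val_y = extract_features(model, val_loader)

knn = KNeighborsClassifier(n_neighbors=20)  # 20-NN, similar in spirit to DINO's k-NN eval
knn.fit(train_x, train_y)
print("kNN accuracy:", accuracy_score(val_y, knn.predict(val_x)))
```

If this accuracy keeps improving across checkpoints, the features are likely still getting better even when the SSL loss looks flat.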
Thank you for your suggestion. I will try to evaluate the effect through downstream testing.
I have also encountered similar problems. In my case it was probably because my images are medical images rather than natural images: after removing the part of the image preprocessing that normalizes with the ImageNet mean and variance, the loss converged from 11 down to around 4. A sketch of that change is below.
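As a rough illustration of that change, here is a sketch assuming a torchvision-style pipeline; the image size follows the 896×896 setup mentioned later in this thread, and the dataset-specific mean/std values are placeholders you would compute from your own data, not real statistics:

```python
from torchvision import transforms

# Standard ImageNet statistics used by typical DINOv2-style augmentations.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

# Option 1: drop the ImageNet normalization entirely (what the comment above describes).
no_norm = transforms.Compose([
    transforms.Resize(896),
    transforms.CenterCrop(896),
    transforms.ToTensor(),
])

# Option 2: normalize with statistics computed on your own (e.g. medical) dataset.
# These numbers are placeholders, not measured values.
domain_norm = transforms.Compose([
    transforms.Resize(896),
    transforms.CenterCrop(896),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.30, 0.30, 0.30), std=(0.20, 0.20, 0.20)),
])
```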
I'm researching a similar scenario: medical images, training a ViT with DINOv2 from scratch. I'm struggling to get training to stabilize. Is there a range of hyperparameters you found worked well for you?
I want to retrain DINOv2 on my own domain-specific dataset of 220k unlabeled images. I resize each image to 896×896 and train on 4 A100 GPUs with batch_size_per_gpu = 20. The config is as follows:
The total loss only converges from 14 to 11. Is that because of bad hyperparameters?
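The actual config block did not come through in this thread. On the small-batch question from the title: one common heuristic (not a setting confirmed to work for DINOv2 specifically) is to scale the learning rate with the total batch size relative to the reference batch size the base LR was tuned for. A sketch, where every numeric value is illustrative:

```python
# Heuristic: scale the learning rate with the total batch size.
# All numbers below are illustrative, not confirmed DINOv2 settings.
batch_size_per_gpu = 20
num_gpus = 4
total_batch_size = batch_size_per_gpu * num_gpus  # 80 in the setup described above

reference_batch_size = 1024  # batch size the base LR was tuned for (assumption)
base_lr = 4e-3               # illustrative base learning rate

scaled_lr = base_lr * total_batch_size / reference_batch_size
print(f"scaled lr for batch size {total_batch_size}: {scaled_lr:.2e}")
```

With a much smaller effective batch size it may also help to lengthen the warmup and lower the teacher temperature schedule, but those are trial-and-error choices rather than documented recommendations.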