Query loss not converging #5
We also found that the query tf fully connected layer is quite large, with nearly 90M parameters, more than 10 times the size of the prompt, which is undoubtedly unacceptable.
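As a rough illustration of how a single fully connected layer reaches ~90M parameters, the sketch below counts the parameters of an `nn.Linear`. The dimensions here are hypothetical, chosen only so the total lands near 90M; the actual MD-DETR layer sizes are not shown in this issue.

```python
import torch.nn as nn

# Hypothetical dimensions for illustration only.
# nn.Linear(d_in, d_out) holds d_in * d_out weights plus d_out biases.
fc = nn.Linear(10000, 9000)
n_params = sum(p.numel() for p in fc.parameters())
print(n_params)  # 10000 * 9000 + 9000 = 90_009_000, roughly 90M
```

This is why projecting between two large feature dimensions with a dense layer dominates the parameter budget compared to a small prompt.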
MD-DETR/engine.py, lines 95 to 115 (commit 125e771)
I found that the query loss is computed inside the `with torch.no_grad():` block, which excludes it from the computation graph. As a result, gradients cannot flow back through the query loss, so it cannot converge.