Dropout layers for Tesseract #4252
Comments
I suggest putting it in a feature branch in your GitHub fork of Tesseract so that other people can see it.
I reformatted your comment.
CC @bertsky: maybe you can help @yaofuzhou with this new feature.
I just pushed my own unfinished efforts: https://github.com/stweil/tesseract/tree/dropout.
[Edited] This is my implementation of the dropout feature so far. There are aspects of @stweil's code that I can learn from, and I will try to incorporate those into my code, giving full credit to @stweil in the process. My original description remains the same, namely:
respectively, which means that I am probably missing something elsewhere in the Tesseract codebase. I tried to search for
Your Feature Request
I am trying to implement dropout layers for Tesseract. The goal for now is to add something like "Dr0.2" to the VGSLSpecs syntax. I have implemented some of the code but have run into a few issues, and I figure this may be the place for discussion.
This is not surprising, as I am sure that additional, essential modifications are needed in other parts of the codebase.
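To make the proposed "Dr0.2" token concrete, here is a minimal sketch of how such a spec fragment could be parsed. It is only an illustration: `ParseDropoutSpec` is a hypothetical standalone helper, not an existing Tesseract function, and the choice to accept rates in [0, 1) is an assumption rather than anything the VGSL syntax defines today.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of Tesseract): parses a token such as "Dr0.2".
// On success it stores the dropout rate in *rate and returns true.
// It returns false if the token is malformed or the rate is outside [0, 1).
bool ParseDropoutSpec(const std::string &token, float *rate) {
  // The token must start with the assumed "Dr" prefix and carry a number.
  if (token.size() < 3 || token.compare(0, 2, "Dr") != 0) {
    return false;
  }
  const char *start = token.c_str() + 2;
  // Reject signs, whitespace and words such as "nan" right after the prefix.
  if (!std::isdigit(static_cast<unsigned char>(*start)) && *start != '.') {
    return false;
  }
  char *end = nullptr;
  const float value = std::strtof(start, &end);
  // Some characters must have been consumed, and nothing may trail the number.
  if (end == start || *end != '\0') {
    return false;
  }
  // A rate of 1 would drop every unit, so only [0, 1) is accepted here.
  if (!(value >= 0.0f && value < 1.0f)) {
    return false;
  }
  *rate = value;
  return true;
}
```

A real implementation would have to live wherever the existing spec tokens are parsed and would need to agree with the maintainers on the token name; the sketch only shows the validation that such a token probably needs.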
I obviously need to be able to disable the dropout feature for the deployed .trainedmodel files, for which I may need to further modify network.cpp. I would like to ask the community about the best practice for adding a new flag or switch for this purpose.
Ideally, when continuing training from a checkpoint, I want to be able to adjust the dropout rate(s) to different values, including setting them to 0 (perhaps when the training is converging). There is probably more than one way to do it, but I want to ask the community for the best practice.
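The two requirements above (no dropout at inference time, and a rate that can be changed when resuming training) can be sketched with standard inverted dropout. This is a self-contained illustration under stated assumptions: `DropoutSketch`, its `training` argument and `SetRate` are hypothetical names, and the class is not derived from any Tesseract network class or wired into its serialization.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical standalone class; it does not use any Tesseract types.
class DropoutSketch {
 public:
  explicit DropoutSketch(float rate, unsigned seed = 42) : rng_(seed) {
    SetRate(rate);
  }

  // Lets a resumed training run change the rate; a rate of 0 disables dropout.
  void SetRate(float rate) {
    assert(rate >= 0.0f && rate < 1.0f);
    rate_ = rate;
  }
  float rate() const { return rate_; }

  // Forward pass. When training is false (deployed model), or the rate is 0,
  // the input is copied through unchanged. Otherwise each unit is kept with
  // probability (1 - rate) and scaled by 1 / (1 - rate), so no rescaling is
  // needed at inference time (inverted dropout).
  void Forward(bool training, const std::vector<float> &input,
               std::vector<float> *output) {
    output->assign(input.begin(), input.end());
    mask_.assign(input.size(), 1.0f);
    if (!training || rate_ == 0.0f) {
      return;
    }
    const float keep = 1.0f - rate_;
    std::bernoulli_distribution keep_unit(keep);
    for (std::size_t i = 0; i < input.size(); ++i) {
      mask_[i] = keep_unit(rng_) ? 1.0f / keep : 0.0f;
      (*output)[i] *= mask_[i];
    }
  }

  // Backward pass: gradients flow only through the units kept by the most
  // recent Forward call, with the same scaling.
  void Backward(const std::vector<float> &grad_out,
                std::vector<float> *grad_in) const {
    assert(grad_out.size() == mask_.size());
    grad_in->resize(grad_out.size());
    for (std::size_t i = 0; i < grad_out.size(); ++i) {
      (*grad_in)[i] = grad_out[i] * mask_[i];
    }
  }

 private:
  float rate_ = 0.0f;
  std::mt19937 rng_;
  std::vector<float> mask_;  // Scaled keep mask from the last Forward call.
};
```

With this design the deployed model needs no separate switch beyond passing `training == false`, and lowering the rate to 0 late in training is a single `SetRate(0.0f)` call. How such a flag and rate would actually be stored in checkpoints is exactly the open question raised above.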
Let me know when you want to go over my already implemented modifications (that do not work yet).