Dropout layers for Tesseract #4252

yaofuzhou · 2024-05-27T09:14:56Z

Your Feature Request

I am trying to implement dropout layers for Tesseract. For now, the goal is to add something like "Dr0.2" to the VGSL specs syntax. I have implemented some of the code but have run into a few issues, and I figure this may be the place to discuss them.
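To make the proposed syntax concrete, here is a minimal sketch of how a single "Dr0.2" token could be validated. The helper name `ParseDropoutSpec` and the accepted range [0, 1) are assumptions for illustration only; this is not how Tesseract's network builder is actually structured, and the real parser walks a whole spec string rather than isolated tokens.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Tesseract: parses one "Dr<rate>" token
// (e.g. "Dr0.2"). On success stores the rate in *rate and returns true.
bool ParseDropoutSpec(const std::string &token, float *rate) {
  assert(rate != nullptr);
  // The token must start with the assumed "Dr" prefix and carry a number.
  if (token.size() < 3 || token.compare(0, 2, "Dr") != 0) return false;
  const char *begin = token.c_str() + 2;
  // Reject leading whitespace or signs, which strtof would otherwise accept.
  if (!std::isdigit(static_cast<unsigned char>(*begin)) && *begin != '.') {
    return false;
  }
  char *end = nullptr;
  const float value = std::strtof(begin, &end);
  // Something must have been consumed, with no trailing characters left.
  if (end == begin || *end != '\0') return false;
  // Assumed valid range: 0 (no dropout) up to, but excluding, 1.
  if (!(value >= 0.0f && value < 1.0f)) return false;
  *rate = value;
  return true;
}
```

With this sketch, "Dr0.2" yields a rate of 0.2, while "Dr1.5", "Dr" and "Mp3,3" are rejected.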

The files I have edited are:

 Changes to be committed:
   (use "git restore --staged <file>..." to unstage)
 	new file:   ../src/lstm/dropout.cpp
 	new file:   ../src/lstm/dropout.h
 
 Changes not staged for commit:
   (use "git add <file>..." to update what will be committed)
   (use "git restore <file>..." to discard changes in working directory)
 	modified:   ../Makefile.am
 	modified:   ../configure.ac (for my own environment and irrelevant to the new dropout feature)
 	modified:   ../src/lstm/fullyconnected.cpp
 	modified:   ../src/lstm/network.cpp
 	modified:   ../src/lstm/network.h
 	modified:   ../src/training/common/networkbuilder.cpp
 	modified:   ../src/training/common/networkbuilder.h

The code compiles but does not run:

  ~/Documents/OCR/tesstrain_units_6 (main*) » make training
  make[1]: Entering directory '~/Documents/OCR/tesstrain_units_6'
  ~/Documents/OCR/tesseract_dr/build/combine_lang_model \
 	--input_unicharset data/units/unicharset \
 	--script_dir data/langdata \
 	--numbers data/units/units.numbers \
 	--puncs data/units/units.punc \
 	--words data/units/units.wordlist \
 	--output_dir data \
 	 \
 	--lang units
  dyld[91402]: symbol not found in flat namespace '__ZN9tesseract7Network11DeSerializeEPNS_5TFileE'
  make[1]: *** [dr_training.mk:40: data/units/units.traineddata] Abort trap: 6
  make[1]: Leaving directory '~/Documents/OCR/tesstrain_units_6'
  make: *** [Makefile:17: training] Error 2

This is not surprising, as I am sure there are additional and essential modifications needed on other parts of the codebase.
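For readers following along, the mangled name in the dyld message can be decoded, which narrows down where to look. The sketch below uses `abi::__cxa_demangle` from `<cxxabi.h>` (available with GCC and Clang); the wrapper name `DemangleDyldSymbol` is made up for this example. The symbol decodes to `tesseract::Network::DeSerialize(tesseract::TFile*)`, so the loader cannot find a compiled definition of that member function. Without seeing the branch, the cause can only be guessed at: for example, a declaration in network.h with no matching definition, or code that calls the base-class function directly.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles a C++ symbol name as printed by dyld on macOS.
// Returns the input unchanged if it cannot be demangled.
std::string DemangleDyldSymbol(std::string symbol) {
  // Mach-O prefixes every symbol with one extra underscore, so "__ZN..."
  // has to become "_ZN..." before the Itanium ABI demangler accepts it.
  if (symbol.compare(0, 3, "__Z") == 0) symbol.erase(0, 1);
  int status = 0;
  char *demangled =
      abi::__cxa_demangle(symbol.c_str(), nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr) return symbol;
  std::string result(demangled);
  std::free(demangled);  // __cxa_demangle allocates with malloc.
  return result;
}
```

The `c++filt` command-line tool performs the same decoding.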

It is obvious that I need to be able to disable dropout for the deployed .traineddata models, for which I may need to further modify network.cpp. I would like to ask the community about the best practice for adding a new flag or switch for this purpose.
Ideally, when continuing training from a checkpoint, I want to be able to adjust the dropout rate(s) to different values, including setting them to 0 (perhaps once training is converging). There is probably more than one way to do this, so I would like to ask the community for the best practice.
Let me know when you want to go over my already implemented modifications (that do not work yet).
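As a point of reference for the train-versus-deploy question, here is a minimal standalone sketch of inverted dropout with an explicit training switch. The class `DropoutSketch` and its members are hypothetical and do not use Tesseract's `Network` interface. With inverted dropout, kept activations are scaled by 1/(1 - rate) during training, so the layer can simply pass its input through at inference time and a deployed model needs no extra rescaling. The backward pass, which would have to reuse the same mask, is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical standalone layer for illustration; not Tesseract's API.
class DropoutSketch {
 public:
  explicit DropoutSketch(float rate, unsigned seed = 42)
      : rate_(rate), rng_(seed) {
    assert(rate >= 0.0f && rate < 1.0f);
  }

  // Dropout is only applied while training is enabled.
  void set_training(bool training) { training_ = training; }

  std::vector<float> Forward(const std::vector<float> &input) {
    // Inference (or rate 0): identity, no masking and no scaling.
    if (!training_ || rate_ == 0.0f) return input;
    std::bernoulli_distribution keep(1.0 - rate_);
    const float scale = 1.0f / (1.0f - rate_);
    std::vector<float> output(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
      // Keep each unit with probability (1 - rate) and rescale it so the
      // expected activation matches the inference-time value.
      output[i] = keep(rng_) ? input[i] * scale : 0.0f;
    }
    return output;
  }

 private:
  float rate_;
  bool training_ = false;  // Off by default, i.e. safe for deployment.
  std::mt19937 rng_;
};
```

Defaulting the switch to "off" means that only the training binary has to opt in, which is one possible answer to the flag question above.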


amitdo · 2024-05-28T10:18:51Z

> Let me know when you want to go over my already implemented modifications (that do not work yet).

I suggest putting it in a feature branch in your GitHub fork of Tesseract, so other people can see it.

amitdo · 2024-05-28T10:31:11Z

I reformatted your comment.

amitdo · 2024-05-28T10:37:20Z

CC @bertsky,

Maybe you can help @yaofuzhou with this new feature.

stweil · 2024-05-28T10:42:05Z

I just pushed my own unfinished efforts: https://github.com/stweil/tesseract/tree/dropout.

yaofuzhou · 2024-05-28T22:41:21Z

[Edited]

This is my implementation of the dropout feature so far -
https://github.com/yaofuzhou/tesseract
I have gone over @stweil's code, and it seems that we are approaching it in a very similar way.

There are aspects of @stweil's code that I can learn from, and I will try to incorporate those into my code and give full credit to @stweil in the process.

My original description remains the same, namely -

My code compiles but does not run. Specifically, the tesseract and lstmtraining binaries yield the error messages

dyld[2292]: symbol not found in flat namespace '__ZN9tesseract7Network11DeSerializeEPNS_5TFileE'
[1]    2292 abort      ./lstmtraining
dyld[2292]: symbol not found in flat namespace '__ZN9tesseract7Network11DeSerializeEPNS_5TFileE'
[1]    2329 abort      ./tesseract

respectively, which means that I am probably missing something elsewhere in the Tesseract codebase. I tried to search for convolve and maxpool to see where these parallel components show up, but have not found the solution. This is probably where I need help the most.

I need to implement a flag/switch somewhere so that the dropout mechanism is only activated during the training process (running the lstmtraining binary) and not during normal usage (running the tesseract binary).
Ideally, I need to implement a mechanism to adjust the dropout_rate for each dropout layer when the lstmtraining binary continues from a checkpoint, as it may be desirable to turn off the dropout feature when the training converges to a good finish.
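One way to picture the checkpoint requirement is to store the rate with the layer and apply an optional override right after loading. The sketch below uses plain iostreams instead of Tesseract's `TFile`, and every name in it (`DropoutConfigSketch`, `ResumeWithRate`, `override_rate`) is an assumption for illustration. How lstmtraining would expose such an override, for instance through a new command-line flag, is left open.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical per-layer state that would be written into a checkpoint.
struct DropoutConfigSketch {
  float rate = 0.0f;

  bool Serialize(std::ostream &out) const {
    out.write(reinterpret_cast<const char *>(&rate), sizeof(rate));
    return !out.fail();
  }

  bool DeSerialize(std::istream &in) {
    in.read(reinterpret_cast<char *>(&rate), sizeof(rate));
    return !in.fail();
  }
};

// Restores the stored rate, then applies an optional override.
// A negative override_rate means "keep the value from the checkpoint";
// 0 switches dropout off for the rest of the training run.
bool ResumeWithRate(std::istream &checkpoint, float override_rate,
                    DropoutConfigSketch *config) {
  assert(config != nullptr);
  if (!config->DeSerialize(checkpoint)) return false;
  if (override_rate >= 0.0f) {
    if (override_rate >= 1.0f) return false;  // Assumed valid range [0, 1).
    config->rate = override_rate;
  }
  return true;
}
```

Keeping the override separate from deserialization leaves the stored value untouched unless the user explicitly asks for a change.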

amitdo added RFC enhancement labels May 28, 2024
