Fine-tuning text detection for non-DB algorithms #13904

VishyAnand28 · 2024-09-25T07:48:49Z

Hello guys,

The fine-tuning guide for the default text detection and recognition algorithms (https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_en/finetune_en.md) is very well written, and I was able to follow it end to end. DB and SVTR_LCNet are the default models for fine-tuning detection and recognition, respectively.

Brief overview of the models I used:
Text detection: en_PP-OCRv3_det
Text recognition: en_PP-OCRv4_rec
The corresponding models and YAML files are listed here: https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_ch/models_list.md

I would like to know how to fine-tune with non-default algorithms:

1. Text detection with SAST: could I simply download the pretrained model with wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/pretrained/ResNet50_vd_ssld_pretrained.pdparams, as described in https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_en/detection_en.md, and use it with this config file: https://github.com/PaddlePaddle/PaddleOCR/blob/main/configs/det/det_r50_vd_sast_icdar15.yml? (I want to fine-tune, not train from scratch.)
2. Text recognition with VisionLAN: I have been exploring this for weeks, but I am still confused about how to proceed.
2.1 As per https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_en/recognition_en.md, we can download the pretrained model with wget -P ./pretrain_models/ https://paddleocr.bj.bcebos.com/PP-OCRv4/english/en_PP-OCRv4_rec_train.tar, which is the en_PP-OCRv4_rec model. This step is clear.
2.2 But https://github.com/PaddlePaddle/PaddleOCR/blob/main/doc/doc_en/algorithm_rec_visionlan_en.md says that the pretrained and trained models can be downloaded from this:

The VisionLAN page also mentions that PaddleOCR modularizes the code, so training a different recognition model only requires changing the configuration file.

The information in 2.1 and 2.2 is a bit confusing to me. Which pretrained model should I use to fine-tune VisionLAN?
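To make the SAST question above concrete, here is a rough sketch of the invocation I have in mind. It only assembles the command line for tools/train.py with a Global.pretrained_model override, which is the pattern the detection docs use. The helper name build_finetune_command is my own and not part of PaddleOCR, and whether the ResNet50_vd backbone weights are the right starting point for fine-tuning is exactly what I am unsure about.

```python
import shlex


def build_finetune_command(config_path, pretrained_model, extra_overrides=None):
    """Assemble the argv for PaddleOCR's tools/train.py.

    Hypothetical helper (not a PaddleOCR API): it places the pretrained
    weights path after -o as a Global.pretrained_model override, followed
    by any additional key=value overrides.
    """
    overrides = {"Global.pretrained_model": pretrained_model}
    overrides.update(extra_overrides or {})
    cmd = ["python3", "tools/train.py", "-c", config_path, "-o"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


if __name__ == "__main__":
    # Assumption: the weights were downloaded into ./pretrain_models/
    # with the wget command quoted in the question.
    cmd = build_finetune_command(
        "configs/det/det_r50_vd_sast_icdar15.yml",
        "./pretrain_models/ResNet50_vd_ssld_pretrained",
    )
    # Print the command so it can be reviewed before running it
    # from the PaddleOCR repository root.
    print(shlex.join(cmd))
```

If this pattern is correct, I assume the VisionLAN case would differ only in the config file and the weights path passed in.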

Thanks in advance for guidance!

@GreatV @WenmuZhou @LDOUBLEV @MissPenguin @tink2123 @UserWangZz ..... can someone please help me?
