-
Hi @aiden890, I will do my best to answer the questions I can.
Potential issues for poor results:
-
Hi!
I am trying to fine-tune a recognition model to recognize spaced special characters.
I trained the model with 30,000 generated images.
It appears that the pre-trained parameters are not being loaded successfully. Although there are no warnings such as “pretrained parameter not in the model,” accuracy starts at 0 and the loss exceeds 150. Furthermore, words that were previously recognized well are now recognized less accurately.
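One way to test the suspicion that the pre-trained weights are not being picked up is to compare parameter names and shapes between the checkpoint and the freshly built model. The following is a minimal sketch, not PaddleOCR's own loading code: it works on plain name-to-shape mappings so it runs without PaddlePaddle installed. In practice the shapes would presumably come from the dictionary returned by `paddle.load` on the `.pdparams` file and from the model's `state_dict()`. The helper name `compare_param_shapes` and the parameter names in the example are hypothetical.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def compare_param_shapes(
    checkpoint: Dict[str, Shape], model: Dict[str, Shape]
) -> Dict[str, List[str]]:
    """Group parameter names by how the checkpoint and the model relate."""
    report: Dict[str, List[str]] = {
        "loaded": [],                 # same name, same shape
        "missing_in_checkpoint": [],  # model parameter with no pretrained value
        "shape_mismatch": [],         # same name, different shape
        "unused_in_checkpoint": [],   # pretrained value the model never uses
    }
    for name, shape in model.items():
        if name not in checkpoint:
            report["missing_in_checkpoint"].append(name)
        elif tuple(checkpoint[name]) != tuple(shape):
            report["shape_mismatch"].append(name)
        else:
            report["loaded"].append(name)
    report["unused_in_checkpoint"] = [n for n in checkpoint if n not in model]
    return report


# Hypothetical parameter names and shapes, for illustration only.
example_checkpoint = {
    "backbone.conv1.weight": (16, 3, 3, 3),
    "head.fc.weight": (120, 97),
}
example_model = {
    "backbone.conv1.weight": (16, 3, 3, 3),
    "head.fc.weight": (120, 120),
}
example_report = compare_param_shapes(example_checkpoint, example_model)
```

If most names land in `loaded`, the checkpoint is being matched and the cause probably lies elsewhere. Entries under `shape_mismatch` in the head layers would suggest that the output size, which depends on the character dictionary, differs from the one used for pre-training.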
Here are my config file and training log:
Global:
  debug: false
  use_gpu: true
  epoch_num: 50
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_ppocr_v4
  save_epoch_step: 10
  eval_batch_step:
  cal_metric_during_train: true
  pretrained_model: ./pretrained_models/en_PP-OCRv4_rec_train.pdparams
  checkpoints: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_words/ch/word_1.jpg
  character_dict_path: ppocr/utils/en_dict.txt
  max_text_length: 25
  infer_mode: false
  use_space_char: true
  distributed: false
  save_res_path: ./output/rec/predicts_ppocrv3_en.txt
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.0005
    warmup_epoch: 5
  regularizer:
    name: L2
    factor: 3.0e-05
Architecture:
  model_type: rec
  algorithm: SVTR_LCNet
  Transform: null
  Backbone:
    name: PPLCNetV3
    scale: 0.95
  Head:
    name: MultiHead
    head_list:
        Neck:
          name: svtr
          dims: 120
          depth: 2
          hidden_dims: 120
          kernel_size:
          - 1
          - 3
          use_guide: true
        Head:
          fc_decay: 1.0e-05
        nrtr_dim: 384
        max_text_length: 25
Loss:
  name: MultiLoss
  loss_config_list:
PostProcess:
  name: CTCLabelDecode
Metric:
  name: RecMetric
  main_indicator: acc
  ignore_space: false
Train:
  dataset:
    name: MultiScaleDataSet
    ds_width: false
    data_dir: ./out_img/
    ext_op_transform_idx: 1
    label_file_list:
    ratio_list:
    transforms:
        img_mode: BGR
        channel_first: false
        prob: 0.5
        ext_data_num: 2
        image_shape:
        max_text_length: 25
        gtc_encode: NRTRLabelEncode
        keep_keys:
  sampler:
    name: MultiScaleSampler
    scales:
    first_bs: 96
    fix_bs: false
    divided_factor:
    is_training: true
  loader:
    shuffle: true
    batch_size_per_card: 128
    drop_last: true
    num_workers: 4
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: ./test_img
    label_file_list:
    transforms:
        img_mode: BGR
        channel_first: false
        image_shape:
        keep_keys:
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 128
    num_workers: 4
profiler_options: null
train.log
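To make the `Optimizer.lr` settings in the config concrete, here is a rough sketch of a generic linear-warmup-plus-cosine schedule using the configured values (`learning_rate: 0.0005`, `warmup_epoch: 5`, `epoch_num: 50`). It is a simplified illustration and not necessarily identical to PaddleOCR's `Cosine` scheduler. The function name `lr_at_step` and the `steps_per_epoch` value are assumptions, since the real step count depends on the dataset size and the sampler.

```python
import math


def lr_at_step(
    step: int,
    base_lr: float = 0.0005,     # Optimizer.lr.learning_rate
    warmup_epochs: int = 5,      # Optimizer.lr.warmup_epoch
    total_epochs: int = 50,      # Global.epoch_num
    steps_per_epoch: int = 100,  # hypothetical; depends on data and sampler
) -> float:
    """Linear warmup from 0 to base_lr, then cosine decay down to 0."""
    warmup_steps = warmup_epochs * steps_per_epoch
    total_steps = total_epochs * steps_per_epoch
    if step < warmup_steps:
        # Warmup phase: the learning rate grows linearly with the step index.
        return base_lr * step / warmup_steps
    # Decay phase: fraction of the post-warmup schedule completed so far.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start of each of the first six epochs.
first_epochs = [lr_at_step(epoch * 100) for epoch in range(6)]
```

Under these assumptions the learning rate is still very small during the first epochs, so metrics logged early in training reflect only a handful of low-rate updates on top of whatever weights were actually loaded.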
My question is,
Thank you for your support!