This issue was moved to a discussion.


0.00000 combineloss when using ch_PP-OCRv4_det_cml.yaml #11507

Closed
Sundragon1993 opened this issue Jan 18, 2024 · 3 comments

Sundragon1993 commented Jan 18, 2024

Please provide the following information to quickly locate the problem

  • System Environment: Ubuntu 20.04
  • Version: Paddle: PaddleOCR: 2.7.0
  • Related components:
  • Command Code: Training phase
  • Complete Error Message: db_Student_loss_cbn: 0.000000, db_Student2_loss_cbn: 0.000000

We provide AceIssueSolver to solve issues, do you want it? (Please write yes/no):

Dear team, I'm trying to train the CML config ch_PP-OCRv4_det_cml.yaml on my custom dataset, but the combined loss is always 0. Training works fine with ch_PP-OCRv4_det_teacher.yaml and ch_PP-OCRv4_det_student.yaml.
Here is my config; I only modified the dataset paths:

Global:
  debug: false
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 20
  save_model_dir: ./output/ch_PP-OCRv4-cml-v1
  save_epoch_step: 100
  eval_batch_step:
  - 0
  - 1000
  cal_metric_during_train: False
  checkpoints: null
  pretrained_model: null
  save_inference_dir: null
  use_visualdl: false
  infer_img: doc/imgs_en/img_10.jpg
  save_res_path: ./checkpoints/det_db/predicts_db.txt
  distributed: true
Architecture:
  name: DistillationModel
  algorithm: Distillation
  model_type: det
  Models:
    Student:
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        det: True
        pretrained: false
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Student2:
      pretrained: null
      model_type: det
      algorithm: DB
      Transform: null
      Backbone:
        name: PPLCNetV3
        scale: 0.75
        det: True
        pretrained: true
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: true
      Head:
        name: DBHead
        k: 50
    Teacher:
      pretrained: /home/hoangdc/workspace/PaddleOCR/.paddleocr/models/teacher.pdparams
      freeze_params: true
      return_all_feats: false
      model_type: det
      algorithm: DB
      Backbone:
        name: ResNet_vd
        in_channels: 3
        layers: 50
      Neck:
        name: LKPAN
        out_channels: 256
      Head:
        name: DBHead
        kernel_list:
        - 7
        - 2
        - 2
        k: 50
Loss:
  name: CombinedLoss
  loss_config_list:
  - DistillationDilaDBLoss:
      weight: 1.0
      model_name_pairs:
      - - Student
        - Teacher
      - - Student2
        - Teacher
      key: maps
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
  - DistillationDMLLoss:
      model_name_pairs:
      - Student
      - Student2
      maps_name: thrink_maps
      weight: 1.0
      key: maps
  - DistillationDBLoss:
      weight: 1.0
      model_name_list:
      - Student
      - Student2
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3
Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 2
  regularizer:
    name: L2
    factor: 5.0e-05
PostProcess:
  name: DistillationDBPostProcess
  model_name:
  - Student
  key: head_out
  thresh: 0.3
  box_thresh: 0.6
  max_candidates: 1000
  unclip_ratio: 1.0
Metric:
  name: DistillationMetric
  base_metric_name: DetMetric
  main_indicator: hmean
  key: Student
Train:
  dataset:
    name: SimpleDataSet
    data_dir: data/detDataYOLOLabel/INVOICE_IMEI
    label_file_list:
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset1.txt
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset2.txt
      - data/detDataYOLOLabel/paddle_annotations_vSBT/Revised/dataset3.txt
    ratio_list: [ 1.0,1.0,0.75]
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - IaaAugment:
        augmenter_args:
        - type: Fliplr
          args:
            p: 0.5
        - type: Affine
          args:
            rotate:
            - -10
            - 10
        - type: Resize
          args:
            size:
            - 0.5
            - 3
    - EastRandomCropData:
        size:
        - 640
        - 640
        max_tries: 50
        keep_ratio: true
    - MakeBorderMap:
        shrink_ratio: 0.4
        thresh_min: 0.3
        thresh_max: 0.7
        total_epoch: 500
    - MakeShrinkMap:
        shrink_ratio: 0.4
        min_text_size: 8
        total_epoch: 500
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - threshold_map
        - threshold_mask
        - shrink_map
        - shrink_mask
  loader:
    shuffle: true
    drop_last: false
    batch_size_per_card: 30
    num_workers: 8
Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/hoangdc/workspace/hoangdc/data/detDataYOLOLabel
    label_file_list:
      - data/detDataYOLOLabel/val.txt
    transforms:
    - DecodeImage:
        img_mode: BGR
        channel_first: false
    - DetLabelEncode: null
    - DetResizeForTest:
          limit_side_len: 320
          limit_type: min
    - NormalizeImage:
        scale: 1./255.
        mean:
        - 0.485
        - 0.456
        - 0.406
        std:
        - 0.229
        - 0.224
        - 0.225
        order: hwc
    - ToCHWImage: null
    - KeepKeys:
        keep_keys:
        - image
        - shape
        - polys
        - ignore_tags
  loader:
    shuffle: false
    drop_last: false
    batch_size_per_card: 1
    num_workers: 8
profiler_options: null



Sundragon1993 commented Jan 18, 2024

I have also tried with:

- DetResizeForTest:
          limit_side_len: 960
          limit_type: max

But the error still persists.
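
To confirm which loss terms are stuck at exactly zero across a training log, a small parser can help. This is a minimal sketch that assumes log lines follow the `name: value` format shown in the error message above; `zero_losses` and `LOSS_PATTERN` are hypothetical helper names, not part of PaddleOCR.

```python
import re

# Matches "<name containing 'loss'>: <number>" pairs, e.g. "db_Student_loss_cbn: 0.000000".
# The log format is an assumption based on the line quoted in this issue.
LOSS_PATTERN = re.compile(r"(\w*loss\w*):\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)")


def zero_losses(log_line):
    """Return the names of loss terms in a log line whose value is exactly zero."""
    return [
        name
        for name, value in LOSS_PATTERN.findall(log_line)
        if float(value) == 0.0
    ]


line = "db_Student_loss_cbn: 0.000000, db_Student2_loss_cbn: 0.000000"
print(zero_losses(line))
```

Running it over every line of the training log would show whether only the `*_loss_cbn` terms are zero or whether other components are affected as well.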

tink2123 (Collaborator) commented

ch_PP-OCRv4_det_cml still has unresolved bugs. Please use another config for now.

yiakwy-xpu-ml-framework-team commented

@tink2123 det_cml is just teacher-student (x2) distillation, with a KL loss that makes the smaller students mimic the teacher output, so you can simply use det_student as a baseline.
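
As a rough illustration of the idea described above, here is a NumPy sketch of a KL-style distillation loss over per-pixel probability maps. It is not PaddleOCR's implementation: `bernoulli_kl` and `cml_style_loss` are hypothetical names, and treating each pixel as a Bernoulli probability is an assumption made for the example.

```python
import numpy as np


def bernoulli_kl(p, q, eps=1e-6):
    """Mean per-pixel KL(p || q) between two probability maps with values in [0, 1]."""
    p = np.clip(p, eps, 1.0 - eps)
    q = np.clip(q, eps, 1.0 - eps)
    kl = p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))
    return float(kl.mean())


def cml_style_loss(teacher, student1, student2, weight=1.0):
    """Illustrative combined loss: students mimic the teacher and each other."""
    # Each student is pulled toward the frozen teacher's map.
    to_teacher = bernoulli_kl(teacher, student1) + bernoulli_kl(teacher, student2)
    # The two students are also pulled toward each other (symmetric term).
    mutual = 0.5 * (
        bernoulli_kl(student1, student2) + bernoulli_kl(student2, student1)
    )
    return weight * (to_teacher + mutual)


rng = np.random.default_rng(0)
teacher = rng.uniform(size=(4, 4))
student1 = rng.uniform(size=(4, 4))
student2 = rng.uniform(size=(4, 4))

loss_random = cml_style_loss(teacher, student1, student2)
loss_identical = cml_style_loss(teacher, teacher.copy(), teacher.copy())
print(loss_random, loss_identical)
```

In this sketch the loss is exactly zero only when all three maps agree at every pixel, which is why a reported 0.000000 from the very first training steps points to a bug rather than to convergence.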

The bug has been fixed in PR #11646.

PaddlePaddle locked and limited conversation to collaborators Jun 12, 2024
SWHL converted this issue into discussion #13045 Jun 12, 2024
