Experiment Results' Reproduce using provided Checkpoint #3

Closed
GMago-LeWay opened this issue Nov 29, 2022 · 10 comments
Labels
bug Something isn't working Done The issue is fixed

Comments

@GMago-LeWay

Hello!
I downloaded the trained checkpoint linked in the README and ran inference on the test set to reproduce the reported results.
The README reports EM / F0.5 of 34.10 / 45.48, but my results (using run_stg_joint.sh) are EM / F0.5 of 50.5 / 37.9. The gap is too large to ignore.
I did have to adjust some code to run inference:

  1. In line 16 of FCGEC/model/STG-correction/Model/tagger_model.py, I had to change self.max_token = args.max_generate + 1 to self.max_token = args.max_generate. Otherwise, the parameter shape of self._hidden2t in the checkpoint does not match the constructed model.
  2. In line 46 of FCGEC-main/model/STG-correction/preprocess_data.py, some additional code is needed because the "uid" of every sentence is required at test time. I therefore added a key column to test.csv and copied it into stg_joint_test.xlsx, then used that spreadsheet to build the final submission. My results are in the GMago row on the CodaLab results page.
@xlxwalex
Owner

Hi, thanks for your feedback! My responses to your questions are as follows:

  1. This is a mistake on our side: max_generate was actually set to 5 when training the released checkpoint. After the rebuttal, we computed the distribution of the data and concluded that 6 would be more appropriate, so we recommend re-running with max_generate=6. Thank you for pointing out the problem; I will update the README afterwards.
  2. This result looks a bit odd. I will download your submission from CodaLab and check it to find the problem. It may take some time; I will reply here once I have checked it.

@xlxwalex xlxwalex added the question Further information is requested label Nov 29, 2022
@xlxwalex
Owner

xlxwalex commented Nov 29, 2022

Hello, I have identified the reason for the performance difference: it lies in our CodaLab scoring system. We are very sorry for the error in our scoring program. More details are given below.

For the correction metrics, we compute the scores only on erroneous sentences, so the correct sentences must be filtered out first (based on the error_flag attribute in the golden label file). While developing the scoring program, I mistakenly used the error_flag from the prediction file instead of the one from the golden label file, which produced wrong values for two metrics (corr_ex and corr_f0.5).
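
To illustrate why the choice of flag matters, here is a minimal sketch under a simplified record layout. The field names `error_flag` and `text`, the helper `exact_match`, and the toy data are all hypothetical; this is not the actual CodaLab scoring program. Exact match is computed only on the sentences kept by the filter, so filtering by the prediction's flag changes which sentences are scored and how many.

```python
def exact_match(gold, pred, flags_from):
    """Exact-match rate over sentences whose error_flag is 1 in `flags_from`."""
    kept = [uid for uid, rec in flags_from.items() if rec["error_flag"] == 1]
    if not kept:
        return 0.0
    hits = sum(pred[uid]["text"] == gold[uid]["text"] for uid in kept)
    return hits / len(kept)


# Toy golden labels: three erroneous sentences and one correct sentence.
gold = {
    "a": {"error_flag": 1, "text": "fixed A"},
    "b": {"error_flag": 1, "text": "fixed B"},
    "c": {"error_flag": 0, "text": "C"},
    "d": {"error_flag": 1, "text": "fixed D"},
}

# Toy predictions: "b" is a missed error, "d" is detected but corrected wrongly.
pred = {
    "a": {"error_flag": 1, "text": "fixed A"},
    "b": {"error_flag": 0, "text": "B"},
    "c": {"error_flag": 0, "text": "C"},
    "d": {"error_flag": 1, "text": "wrong D"},
}

# Intended behaviour: keep sentences that are erroneous in the golden labels.
intended = exact_match(gold, pred, flags_from=gold)

# Behaviour described as the bug: keep sentences the model flagged as erroneous.
buggy = exact_match(gold, pred, flags_from=pred)

print(f"gold-filtered EM: {intended:.3f}")
print(f"pred-filtered EM: {buggy:.3f}")
```

In this toy case the missed error "b" silently drops out of the denominator when the prediction's flag is used, so the score rises from 1/3 to 1/2.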

We have fixed the bug, and you can resubmit your previous predict.zip file for re-testing. The results will be:
[image: codalab_exp]

Meanwhile, I have updated the py and bash files to add the uid to the output file.

Thank you for your feedback! If you cannot reproduce our results on CodaLab, feel free to comment here.

@xlxwalex xlxwalex added bug Something isn't working Done The issue is fixed and removed question Further information is requested labels Nov 29, 2022
@xlxwalex xlxwalex pinned this issue Nov 29, 2022
@GMago-LeWay
Author

I made a submission and now the results of the given checkpoint are consistent with README (EM / F0.5 : 34.10 / 45.48).
Thanks for your reply!

@kingfan1998

Hello!
I downloaded the checkpoint and the pretrained model and changed the paths to my own, but I still get an error:
"joint_evaluate.py: error: argument --lm_path: expected one argument"
How can I solve it?
Thanks!

@xlxwalex
Owner

xlxwalex commented Apr 2, 2023

Hi, it seems you have used multiple values for lm_path, which stands for the path to the pre-trained language model. Can you share the complete bash script or the command?

@kingfan1998

#!/bin/bash
# Copyright 2022 The ZJU MMF Authors (Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Jiayu Fu and Ming Cai *).
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Train and Test for STG-Joint

# Global Variable (!!! SHOULD ADAPT TO YOUR CONFIGURATION !!!)
CUDA_ID=1
SEED=2022
EPOCH=50
BATCH_SIZE=32
MAX_GENERATE=5 # MAX T
SPECIAL_MAPPING=false # More details can be found in ISSUE 10
CHECKPOINT_DIR=checkpoints

# Roberta-base-chinese can be downloaded at https://github.com/ymcui/Chinese-BERT-wwm
#PLM_PATH=/datadisk2/xlxw/Resources/pretrained_models/roberta-base-chinese/ # pretrained-model path
PLM_PATH= /pretrained_models/chinese-roberta-wwm-ext/
OUTPUT_PATH=stg_joint_test.xlsx

JOINT_CHECK_DIR=1021_jointmodel_stg

# STEP 1 - PREPROCESS DATASET
#DATA_BASE_DIR=dataset
#DATA_OUT_DIR=stg_joint
#DATA_TRAIN_FILE=FCGEC_train.json
#DATA_VALID_FILE=FCGEC_valid.json
#DATA_TEST_FILE=FCGEC_test.json

#python preprocess_data.py --mode normal --err_only True \
#--data_dir ${DATA_BASE_DIR} --out_dir ${DATA_OUT_DIR} \
#--train_file ${DATA_TRAIN_FILE} --valid_file ${DATA_VALID_FILE} --test_file ${DATA_TEST_FILE}

# STEP 2 - TRAIN STG-Joint MODEL
#python joint_stg.py --mode train \
#--gpu_id ${CUDA_ID} \
#--seed ${SEED} \
#--checkpoints ${CHECKPOINT_DIR} \
#--checkp ${JOINT_CHECK_DIR} \
#--data_base_dir ${DATA_BASE_DIR}/${DATA_OUT_DIR} \
#--lm_path ${PLM_PATH} \
#--batch_size ${BATCH_SIZE} \
#--epoch ${EPOCH} \
#--max_generate ${MAX_GENERATE}

# STEP 3 - TRAIN STG-Joint MODEL
python joint_evaluate.py --mode test --gpu_id ${CUDA_ID} --seed ${SEED} \
--checkpoints ${CHECKPOINT_DIR} --checkp ${JOINT_CHECK_DIR} \
--export ${OUTPUT_PATH} \
--data_base_dir ${DATA_BASE_DIR}/${DATA_OUT_DIR} \
--max_generate ${MAX_GENERATE} \
--lm_path ${PLM_PATH} \
--batch_size ${BATCH_SIZE} \
--sp_map ${SPECIAL_MAPPING}

run: sh run_stg_joint.sh

@xlxwalex
Owner

xlxwalex commented Apr 2, 2023

The only error I can find so far is that you have commented out two variables, DATA_BASE_DIR and DATA_OUT_DIR, which will prevent joint_evaluate.py from running properly. The configuration for lm_path seems to be correct, though. Have you tried taking the joint_evaluate.py command out of the bash script and running it directly on the command line?
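
One more thing that may be worth checking, assuming the script was run exactly as pasted: there is a space after `PLM_PATH=`. In sh/bash, an assignment followed by a space does not set the variable for later lines, so an unquoted `${PLM_PATH}` would expand to nothing and `--lm_path` would be followed directly by `--batch_size`. The Python sketch below reproduces the same argparse message under that assumption. The parser is a stand-in with only two flags (`RaisingParser`, `build_parser`, and `parse_with` are hypothetical names), not the project's actual joint_evaluate.py.

```python
import argparse
import shlex


class RaisingParser(argparse.ArgumentParser):
    """ArgumentParser that raises instead of exiting, so the message can be inspected."""

    def error(self, message):
        raise ValueError(message)


def build_parser():
    # Stand-in parser with only the two flags relevant here (hypothetical subset).
    parser = RaisingParser(prog="joint_evaluate.py")
    parser.add_argument("--lm_path", type=str)
    parser.add_argument("--batch_size", type=int)
    return parser


def parse_with(plm_path):
    # shlex.split approximates how the shell splits an unquoted expansion into words:
    # an empty value contributes no word at all.
    argv = shlex.split(f"--lm_path {plm_path} --batch_size 32")
    return build_parser().parse_args(argv)


# Variable set as intended: --lm_path receives its value.
ok = parse_with("/pretrained_models/chinese-roberta-wwm-ext/")

# Variable empty: --lm_path is followed directly by another flag.
try:
    parse_with("")
    empty_error = None
except ValueError as exc:
    empty_error = str(exc)

print(ok.lm_path)
print(empty_error)
```

If this is the cause, removing the space after `=` in the assignment would be the first thing to try.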

@kingfan1998

Sorry!

In evaluate_joint_config.py, I forgot to modify the parameters:

 # Pretrained Model Params
 pretrained_args = ArgumentGroup(parser, 'pretrained', 'Pretrained Model Settings')
 pretrained_args.add_arg('use_lm', bool, True, 'Whether Model Use Language Models')

 ############################
 # pretrained_args.add_arg('lm_path', str, '/datadisk2/xlxw/Resources/pretrained_models/roberta-base-chinese', 'Bert Pretrained Model Path')
 pretrained_args.add_arg('lm_path', str, './pretrained_models/chinese-roberta-wwm-ext/', 'Bert Pretrained Model Path')
 ############################

 pretrained_args.add_arg('lm_hidden_size', int, 768, 'HiddenSize of PLM')
 pretrained_args.add_arg('output_hidden_states', bool, True, 'Output PLM Hidden States')
 pretrained_args.add_arg('finetune', bool, True, 'Finetune Or Freeze')

But I encountered a new problem:

Some weights of the model checkpoint at ./pretrained_models/chinese-roberta-wwm-ext/ were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.weight']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

RuntimeError: Error(s) in loading state_dict for JointModel:
size mismatch for tagger._hidden2t.linear.weight: copying a param with shape torch.Size([6, 768]) from checkpoint, the shape in current model is torch.Size([7, 768]).
size mismatch for tagger._hidden2t.linear.bias: copying a param with shape torch.Size([6]) from checkpoint, the shape in current model is torch.Size([7]).

The pre-trained language model I am using is https://huggingface.co/hfl/chinese-roberta-wwm-ext/

@kingfan1998
Thank you, I solved it. In the py file, I set the Number of Tagger Classes to 5 and the Number of Max Token Generation to 5. Thank you very much!
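
The shapes in the error message are consistent with the tagger head having max_generate + 1 outputs, as in the `self.max_token = args.max_generate + 1` line quoted earlier in this thread. Below is a small sketch of that arithmetic, assuming this relationship holds and using plain tuples in place of real tensors. The helper name `infer_max_generate` is hypothetical.

```python
def infer_max_generate(weight_shape):
    """Infer max_generate from the (out_features, hidden) shape of the tagger head.

    Assumes out_features == max_generate + 1.
    """
    out_features, _hidden = weight_shape
    return out_features - 1


# Shape stored in the released checkpoint, taken from the error message.
checkpoint_shape = (6, 768)

# Shape of the model built from the current configuration, from the same message.
model_shape = (7, 768)

ckpt_value = infer_max_generate(checkpoint_shape)
model_value = infer_max_generate(model_shape)

print("checkpoint max_generate:", ckpt_value)
print("current model max_generate:", model_value)
```

Under that assumption, the checkpoint corresponds to max_generate = 5 and the mismatching model to max_generate = 6, which agrees with the fix of setting the value to 5.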

@xlxwalex
Owner

xlxwalex commented Apr 2, 2023

You're welcome.

@xlxwalex xlxwalex mentioned this issue Jul 13, 2023