
cannot reproduce the baseline score of question answering with transformers v4.19.2 #9

Open
kumapo opened this issue Feb 1, 2023 · 2 comments

Comments

@kumapo

kumapo commented Feb 1, 2023

I tried to reproduce the baseline score with the run_squad.py parameters you provided and a patched transformers v4.19.2,
but the resulting scores in eval_results.json are quite low compared to the baseline:

    "exact": 42.30076542098154,
    "f1": 42.390814948221525,

Based on fine-tuning/README.md, I understand you confirmed that transformers v4.19.2 works.
What score did you get in that case?

I'm attaching the requirements.txt and eval_results.json from my test with transformers v4.19.2.
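For context, the "exact" and "f1" numbers above come from SQuAD-style per-answer scoring. A minimal sketch of those two metrics (simplified to whitespace tokenization, without the answer normalization the official SQuAD evaluation and run_squad.py apply):

```python
from collections import Counter

def exact_match(prediction: str, truth: str) -> int:
    """1 if the predicted answer string matches the gold answer exactly, else 0."""
    return int(prediction.strip() == truth.strip())

def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between predicted and gold answer (whitespace tokens)."""
    pred_tokens = prediction.split()
    truth_tokens = truth.split()
    # Multiset intersection: how many tokens the two answers share.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

The dataset-level scores in eval_results.json are these per-answer values averaged over all examples (times 100), so an "exact" of 42.3 means fewer than half of the predictions matched a gold answer exactly — which is why a drop this large usually points at tokenization or preprocessing rather than the model itself.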

@tomohideshibata
Contributor

Thank you for your report. I will check it.

Which pretrained model have you used?

@kumapo
Author

kumapo commented Feb 1, 2023

@tomohideshibata

Thank you for the quick reply.
I used cl-tohoku/bert-base-japanese-v2.
