
About valid and test set result #75

Open
paulthemagno opened this issue Nov 7, 2019 · 1 comment
@paulthemagno

I have fine-tuned a BERT-NER model, and in eval_result.txt I got these values:
P=0.608764
R=0.588080
F=0.594982

As I understand it, these results come from the dev (validation) set, while on the test set I got:

processed 40982 tokens with 4577 phrases; found: 4645 phrases; correct: 4158.
accuracy:  98.22%; precision:  89.52%; recall:  90.85%; FB1:  90.18
              LOC: precision:  92.54%; recall:  92.54%; FB1:  92.54  1394
             MISC: precision:  81.21%; recall:  82.31%; FB1:  81.76  676
              ORG: precision:  84.54%; recall:  88.56%; FB1:  86.51  1255
              PER: precision:  95.30%; recall:  95.45%; FB1:  95.38  1320

I'd like to understand the mismatch with respect to the standard CoNLL evaluation script.
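
For context, the conlleval script scores whole entities rather than individual tokens: a predicted span only counts as correct if both its boundaries and its type match the gold span. Below is a minimal sketch of reproducing that style of scoring in Python, assuming BIO-tagged label sequences and using the seqeval library; neither seqeval nor the toy sequences appear in this issue, both are illustrative.

```python
# Illustrative only: seqeval is a common Python reimplementation of
# CoNLL-style entity-level scoring; it is not part of this repository.
from seqeval.metrics import classification_report, f1_score, precision_score, recall_score

# Toy BIO-tagged sequences, one inner list per sentence (made up for this sketch).
y_true = [["B-PER", "I-PER", "O", "B-LOC"], ["B-ORG", "O"]]
y_pred = [["B-PER", "I-PER", "O", "B-LOC"], ["B-LOC", "O"]]

# Entity-level scoring: the sentence-2 prediction counts as wrong because
# its type differs (LOC vs. ORG), even though the token boundary is right.
print(precision_score(y_true, y_pred))  # 2 correct / 3 predicted spans ≈ 0.67
print(recall_score(y_true, y_pred))     # 2 correct / 3 gold spans      ≈ 0.67
print(f1_score(y_true, y_pred))
print(classification_report(y_true, y_pred))
```

If eval_result.txt were instead computed per token, or over a different split of the data, its numbers could diverge substantially from conlleval's entity-level figures.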

@huanghonggit

@paulthemagno Has the accuracy problem been solved?
