I have fine-tuned a BERT-NER model, and in eval_result.txt I got these values:
P=0.608764
R=0.588080
F=0.594982
As I understand it, these results come from the dev (validation) dataset, while on the test set I got:
processed 40982 tokens with 4577 phrases; found: 4645 phrases; correct: 4158.
accuracy:  98.22%; precision:  89.52%; recall:  90.85%; FB1:  90.18
              LOC: precision:  92.54%; recall:  92.54%; FB1:  92.54  1394
             MISC: precision:  81.21%; recall:  82.31%; FB1:  81.76  676
              ORG: precision:  84.54%; recall:  88.56%; FB1:  86.51  1255
              PER: precision:  95.30%; recall:  95.45%; FB1:  95.38  1320
I'd like to understand this mismatch with respect to the standard CoNLL evaluation script.
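One common source of such a mismatch is that the CoNLL script (conlleval) scores at the phrase level: a predicted entity counts as correct only if its full span and type match the gold annotation, whereas token-level scoring gives credit for partial overlaps. The sketch below (hypothetical tags, not the author's data; `extract_spans` and `phrase_f1` are illustrative helpers, not part of conlleval) shows how the two views can diverge on the same prediction:

```python
def extract_spans(tags):
    """Return a set of (start, end, type) entity spans from BIO tags."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # "O" sentinel flushes the last span
        if tag.startswith("B-") or tag == "O" or (etype and tag[2:] != etype):
            if start is not None:
                spans.add((start, i, etype))
                start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        elif tag.startswith("I-") and start is None:  # lenient: stray I- opens a span
            start, etype = i, tag[2:]
    return spans

def phrase_f1(true_tags, pred_tags):
    """CoNLL-style F1: a span is correct only on an exact boundary+type match."""
    gold, pred = extract_spans(true_tags), extract_spans(pred_tags)
    correct = len(gold & pred)
    p = correct / len(pred) if pred else 0.0
    r = correct / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

true = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "O",     "O", "B-LOC"]  # PER span truncated by one token
token_acc = sum(t == p for t, p in zip(true, pred)) / len(true)
print(token_acc)              # 0.75 -- token level still looks decent
print(phrase_f1(true, pred))  # 0.5  -- phrase level rejects the partial span
```

So if eval_result.txt reports token-level (or differently averaged) metrics while the test numbers come from conlleval's phrase-level matching, the two sets of figures are not directly comparable.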
@paulthemagno Is the accuracy problem solved?