updated samples, documentation, and DDL for v2.0.0 release

tatonetti-lab · Nov 12, 2022 · 143022b
1 parent 6b1a15d
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -8,7 +8,7 @@ A resource of adverse drug effects extracted from FDA structured product labels.
 
 Second release of the OnSIDES database of adverse reactions and boxed warnings extracted from the FDA structured product labels (SPLs). This version contains significant model improvements as well as updated labels. All labels available to download from DailyMed (https://dailymed.nlm.nih.gov/dailymed/spl-resources-all-drug-labels.cfm) as of November 10, 2022 were processed in this analysis. In total XXX million adverse reactions were extracted from XX,000 labels for just under X,000 drug products (single agents or combinations).
 
-OnSIDES was created using the [PubMedBERT language model](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract) and 200 manually curated labels available from [Denmer-Fushman et al.](https://pubmed.ncbi.nlm.nih.gov/29381145/). The model achieves an F1 score of 0.90, AUROC of 0.92, and AUPR of 0.95 at extracting effects from the ADVERSE REACTIONS section of the label. This constitutes an absolute increase of 4% in each of the performance metrics over v1.0.0. For the BOXED WARNINGS section, the model achieves a F1 score of 0.78, AUROC of 0.83, and AUPR of 0.79. This constitutes an absolute increase of 10-17% in the performance metrics over v1.0.0. Compared against the TAC reference standard using the official evaluation script the model achieves a Micro-F1 score of 0.87 and a Macro-F1 of 0.85.
+OnSIDES was created using the [PubMedBERT language model](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract) and 200 manually curated labels available from [Demner-Fushman et al.](https://pubmed.ncbi.nlm.nih.gov/29381145/). The model achieves an F1 score of 0.90, AUROC of 0.92, and AUPR of 0.95 at extracting effects from the ADVERSE REACTIONS section of the label. This constitutes an absolute increase of 4% in each of the performance metrics over v1.0.0. For the BOXED WARNINGS section, the model achieves an F1 score of 0.71, AUROC of 0.85, and AUPR of 0.72. This constitutes an absolute increase of 10-17% in the performance metrics over v1.0.0. Compared against the TAC reference standard using the official evaluation script, the model achieves a Micro-F1 score of 0.87 and a Macro-F1 of 0.85.
 
 **Table 1. Performance metrics evaluated against the TAC gold standard**