Repository for *Benchmarking Scalable Predictive Uncertainty in Text Classification*, by Jordy Van Landeghem, Matthew Blaschko, Bertrand Anckaert, and Marie-Francine Moens.
It contains the source code, the experiments, and the datasets used in the paper, located in the `src`, `experiments`, and `datasets` directories, respectively.
Perfect predictive accuracy is unattainable for most text classification problems, which motivates the need for reliable ML solutions that can communicate predictive uncertainty when dealing with noisy or unknown inputs. In the quest for a simple, principled, and scalable uncertainty method: which one should you choose, when, and why?
Our survey of Bayesian Deep Learning methods, benchmarked on 6 different text classification datasets, aims to help practitioners make this decision and to spur future researchers to continue investigating hybrid uncertainty methods.
Main file: `experiment.py`

Example command:

```shell
python3 experiment.py with CONFIG_NAME identifier=DATASET
```
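The `with CONFIG_NAME key=value` command tail follows the Sacred-style convention of selecting a named configuration and overriding individual fields from the CLI. As a minimal, stdlib-only sketch of how such a command tail could be parsed (the function name and the `mc_dropout`/`imdb` values below are hypothetical placeholders, not names from this repository):

```python
def parse_overrides(argv):
    """Parse a Sacred-style command tail: `with CONFIG_NAME key=value ...`.

    Returns the selected config name and a dict of key/value overrides;
    returns (None, {}) when no `with` clause is present.
    """
    if len(argv) < 2 or argv[0] != "with":
        return None, {}
    config_name = argv[1]
    overrides = {}
    for token in argv[2:]:
        # Each override token has the form key=value.
        key, _, value = token.partition("=")
        overrides[key] = value
    return config_name, overrides


# Hypothetical invocation: python3 experiment.py with mc_dropout identifier=imdb
name, overrides = parse_overrides(["with", "mc_dropout", "identifier=imdb"])
print(name, overrides)  # mc_dropout {'identifier': 'imdb'}
```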
@inproceedings{VanLandeghem2021,
TITLE = {Benchmarking Scalable Predictive Uncertainty in Text Classification},
AUTHOR = {Van Landeghem, Jordy and Blaschko, Matthew B. and Anckaert, Bertrand and Moens, Marie-Francine},
BOOKTITLE = {Submitted to ...},
YEAR = {2021}
}
The code was originally run in a corporate environment* and has since been reimplemented and open-sourced to aid the research community. There may be small differences between the current output and the results presented in the paper.