The second-moment loss (SML) is a novel training objective for dropout-based regression networks that yields improved uncertainty estimates. This code accompanies our paper "Second-Moment Loss: A Novel Regression Objective for Improved Uncertainties" and allows the results therein to be reproduced.
- The full network (without dropout) predicts the output for a given input, whereas dropout sub-networks are trained to model uncertainty and prediction residuals
- Technically, this is achieved by adding a simple novel loss term to typical regression objectives such as the mean squared error (MSE) loss (see the sketch below)
- SML training reaches state-of-the-art performance in our empirical study and improves upon it under data shift and in terms of worst-case performance
- SML can serve as a drop-in replacement for MC dropout on regression tasks without increasing computational requirements
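As a rough illustration of this idea, here is a minimal PyTorch-style sketch, not the exact objective from the paper: `model` and `beta` are placeholder names, and the precise loss term is defined in the paper and in `model_training_and_evaluation.ipynb`.

```python
import torch
import torch.nn.functional as F

def second_moment_loss_sketch(model, x, y, beta=0.5):
    # Full-network prediction (dropout disabled), trained with a standard MSE term.
    model.eval()
    y_full = model(x)
    mse = F.mse_loss(y_full, y)

    # A sampled dropout sub-network: its deviation from the full-network
    # prediction is trained to match the magnitude of the residual, so that
    # the spread of sub-networks reflects the predictive uncertainty.
    model.train()
    y_sub = model(x)
    residual = (y - y_full.detach()).abs()
    deviation = (y_sub - y_full.detach()).abs()
    sm_term = (deviation - residual).pow(2).mean()

    # beta interpolates between the standard MSE and the second-moment term.
    return (1.0 - beta) * mse + beta * sm_term
```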
Model training with the second-moment loss is compared to various approaches to uncertainty estimation:
- MC dropout ('full' and last-layer variant),
- parametric uncertainty (i.e., an additional output for the variance, optimized via the negative log-likelihood),
- parametric uncertainty combined with MC dropout,
- deep ensembles (variants with and without parametric uncertainty).
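For reference, the parametric-uncertainty baselines predict a mean and a variance per input and minimize a Gaussian negative log-likelihood. A minimal sketch of such an objective (our own illustration, not the repository code):

```python
import torch

def gaussian_nll(mu, log_var, y):
    # Negative log-likelihood of y under N(mu, exp(log_var)), up to an additive constant.
    # Predicting the log-variance keeps the variance strictly positive.
    return 0.5 * (log_var + (y - mu).pow(2) / log_var.exp()).mean()
```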
We used a conda environment on Debian Linux 9. Use `sml.yml` to create this environment as follows:
`conda env create --name sml --file=sml.yml`
In order to rerun our evaluation, you need to add the following UCI datasets to the `data` folder that is located in the same directory as the Jupyter notebooks.

- concrete (http://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength): put `Concrete_Data.xls` into the data folder.
- energy (https://archive.ics.uci.edu/ml/datasets/Energy+efficiency): put `ENB2012_data.xlsx` into the data folder.
- abalone (https://archive.ics.uci.edu/ml/datasets/abalone): put `abalone.data` and `abalone.names` into the data folder.
- naval (https://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants): extract `UCI CBM Dataset.zip` into the data folder such that the data folder has a subdirectory `UCI CBM Dataset`.
- power (https://archive.ics.uci.edu/ml/datasets/Combined+Cycle+Power+Plant): extract the `CCPP.zip` file into the data folder such that the data folder has a subdirectory `CCPP`.
- protein (https://archive.ics.uci.edu/ml/datasets/Physicochemical+Properties+of+Protein+Tertiary+Structure): put `CASP.csv` into the data folder.
- wine quality (https://archive.ics.uci.edu/ml/datasets/wine+quality): put `winequality-red.csv` into the data folder.
- yacht (http://archive.ics.uci.edu/ml/datasets/yacht+hydrodynamics): put `yacht_hydrodynamics.data` into the data folder.
- year (https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD): extract `YearPredictionMSD.txt.zip` into the data folder such that the data folder contains the file `YearPredictionMSD.txt`.
- superconduct (https://archive.ics.uci.edu/ml/datasets/Superconductivty+Data): extract `superconduct.zip` into the data folder such that the data folder has a subdirectory `superconduct`.
- boston, diabetes, california: these are obtained using functions from `sklearn` (v0.23.1), namely `load_boston`, `load_diabetes`, and `fetch_california_housing` from the `sklearn.datasets` package.
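The sklearn-provided datasets therefore require no manual download. A minimal loading example (for illustration only; the notebook handles this internally):

```python
from sklearn.datasets import load_boston, load_diabetes, fetch_california_housing

# Each loader returns a Bunch with the feature matrix in .data and targets in .target.
boston = load_boston()                    # available in v0.23.1, removed in sklearn >= 1.2
diabetes = load_diabetes()
california = fetch_california_housing()   # downloaded on first use
X, y = california.data, california.target
```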
The toy datasets "toy_hf" and "toy_modulated" are generated by calling `load_dataset` (with argument `id='toy_hf'` or `id='toy_modulated'`, respectively) in the `model_training_and_evaluation.ipynb` notebook.
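An illustrative call (the exact return values are defined by the notebook's `load_dataset` implementation):

```python
# Hypothetical usage; see the notebook for the actual signature and return values.
toy_hf = load_dataset(id='toy_hf')
toy_modulated = load_dataset(id='toy_modulated')
```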
The code for the experiments on the UCI and toy datasets is provided in the Jupyter notebook `model_training_and_evaluation.ipynb`. Make sure the conda environment is activated, then execute all cells except the last one.
In the last cell, insert a path in line 24 (`exp_dir = ...`). The path points to a directory `exp_dir` in which experiment log files are stored. If the directory does not exist, it will be created.
To run the training/evaluation for the hyperparameter study (beta parameter), replace the line
`methods = available_methods`
with
`methods = {'mc_mod_sml1', 'mc_mod_sml25', 'mc_mod_sml', 'mc_mod_sml75', 'mc_mod_sml90'}`
This starts the training with the SML loss for different values of beta.
Run the last cell to start the experiment. After the training finishes, `exp_dir` contains several subdirectories for each dataset.
Those subdirectories contain zipped dictionaries holding the performance/uncertainty metrics on train/test data; see the next section.
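The exact serialization of these result dictionaries is defined in the notebook; assuming standard Python tooling (gzip plus pickle, which is an assumption on our part), inspecting a result file could look roughly as follows:

```python
import gzip
import pickle

# Hypothetical inspection snippet: the file names, the layout of exp_dir and the
# serialization format are defined in the notebook and may differ.
with gzip.open('path/to/exp_dir/<dataset>/<results-file>', 'rb') as f:
    metrics = pickle.load(f)
print(sorted(metrics.keys()))
```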
To visualize evaluation results, another notebook is provided: `model_evaluation_and_plots.ipynb`.

- First run cells 1-4.
- To create the plots in Figure 12,
  - first run cells 5, 6.
  - Set the path in the `exp_dirs` list. It should point to the `exp_dir` used when running the training/evaluation script.
  - Finally, run cell 7. The plots will be stored in the `./plots` directory.
- To create the plots in Figure 2,
  - first run cells 8, 9.
  - In cell 10 change the `exp_dir` path accordingly. It should point to an `exp_dir` that contains a subdirectory with the "toy_modulated" dataset.
  - Run cell 10 to create the plots for "toy_modulated". They will be stored in `./plots/toy_modulated.pdf`.
  - In cell 11 change the `exp_dir` path accordingly. It should point to an `exp_dir` that contains a subdirectory with the "toy_hf" dataset.
  - Run cell 11 to create the plot for "toy_hf". It will be stored in `./plots/toy_hf.pdf`.
- To create plots similar to Figures 3, 8, 9, 10,
  - first run cells 12, 13, 14.
  - In cell 15 change the `exp_dirs` list accordingly. It should contain paths to `exp_dir`.
  - Run cell 15 afterwards.
  - Run cell 18 to create a plot showing RMSE values for all datasets. Run cell 19 to include "SML_DE" in the plot.
  - Run cell 21 to create a plot showing NLL values for all datasets. Run cell 22 to include "SML_DE" in the plot.
  - Run cell 24 to create a plot showing ECE values for all datasets. Run cell 25 to include "SML_DE" in the plot.
  - Run cell 27 to create a plot showing Wasserstein distance values for all datasets. Run cell 28 to include "SML_DE" in the plot.
  - All plots will be stored in the `./plots` directory.
- To create plots similar to Figures 13, 14,
  - change the `exp_dirs` list in cell 29 accordingly. It should contain paths to the `exp_dir` of a hyperparameter training/evaluation run as described above.
  - Then run cells 29-33 to create the plots for the hyperparameter study with regard to RMSE, NLL, ECE and Wasserstein distance. The plots will be stored in the `./plots` directory.
- To create the plot in Figure 15,
  - first run cell 34.
  - Again, change the `exp_dirs` list in cell 35 accordingly. It should contain paths to the `exp_dir` of a standard training/evaluation run.
  - Then run cells 35, 38, 39, 40 to create the plots. They will be stored in the `./plots` directory.
Due to random train-test splits and random initializations, the obtained results might slightly deviate from those reported in the paper.
All experiments are conducted on an Intel(R) Xeon(R) Gold 6126 CPU @ 2.60GHz. For object detection we additionally used a Tesla V100 GPU.
Conducting the described experiments with cross validation takes 6h (on CPU) for toy data, 80h (on CPU) for UCI regression and 8h (on GPU) for object regression.
The SML code is released under the MIT license.
If you use or reference the second-moment loss in your research, please use the following BibTeX entry:

@article{sml,
  author = {Joachim Sicking and Maram Akila and Maximilian Pintz and Tim Wirtz and Asja Fischer},
  title  = {Second-Moment Loss: A Novel Regression Objective for Improved Uncertainties},
  year   = {2020}
}