Second-Moment Loss

The second-moment loss (SML) is a novel training objective for dropout-based regression networks that yields improved uncertainty estimates. The code accompanies our paper "Second-Moment Loss: A Novel Regression Objective for Improved Uncertainties" and allows one to reproduce the results reported therein.

Overview

  • The full network (without dropout) predicts the output for a given input, whereas dropout sub-networks are trained to model uncertainty and prediction residuals
  • Technically, this is achieved by adding a simple novel loss term to typical regression objectives such as the mean squared error (see the sketch after this list)
  • SML training matches state-of-the-art performance in our empirical study and surpasses it under data shift and in worst-case scenarios
  • SML can serve as a drop-in replacement for MC dropout on regression tasks without increasing computational requirements
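The loss itself is defined in the paper and implemented in the training notebook. The following is only a minimal sketch of the overall structure, assuming a PyTorch-style model whose only mode-dependent layers are dropout layers; the function name, the weight beta, and the exact form of the sub-network term are illustrative assumptions, not the repository's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def sml_objective(model, x, y, beta=0.5, n_samples=5):
    # Full-network prediction: eval mode disables dropout, gradients still flow.
    model.eval()
    y_full = model(x)
    mse = F.mse_loss(y_full, y)

    # Dropout sub-networks: train mode re-enables dropout. Each sampled
    # sub-network is pushed to deviate from the full prediction by roughly
    # the observed residual (one plausible reading of the overview above).
    model.train()
    residual = (y - y_full).abs().detach()
    sml_term = 0.0
    for _ in range(n_samples):
        y_sub = model(x)
        sml_term = sml_term + ((y_sub - y_full.detach()).abs() - residual).pow(2).mean()

    return mse + beta * sml_term / n_samples

# Example usage with a hypothetical model:
# model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Dropout(0.2), nn.Linear(64, 1))
# loss = sml_objective(model, x, y)
# loss.backward()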

Baselines

Model training with the second-moment loss is compared to various approaches for uncertainty estimation (a generic MC-dropout inference sketch follows the list):

  • MC dropout ("full" and last-layer variant),
  • parametric uncertainty (i.e. additional output for variance, optimization of negative log-likelihood),
  • parametric uncertainty with MC dropout,
  • deep ensembles (variants with and without parametric uncertainty).
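At inference time, all MC-dropout-style methods share the same core idea: keep dropout active at test time and aggregate several stochastic forward passes. A generic sketch (not the repository's code; it assumes dropout is the model's only train-mode-dependent layer):

import torch

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=50):
    # Staying in train mode keeps dropout active during the forward passes.
    model.train()
    samples = torch.stack([model(x) for _ in range(n_samples)])
    # Predictive mean and a spread-based uncertainty estimate.
    return samples.mean(dim=0), samples.std(dim=0)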

Installation

Conda Virtual Environment

We used a conda environment on Debian 9 (Linux). Use sml.yml to create this environment as follows:

conda env create --name sml --file=sml.yml
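Afterwards, activate the environment (the steps below assume it is active):

conda activate sml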

Datasets

To rerun our evaluation, you need to add the UCI datasets used in the paper to the data folder, which is located in the same directory as the Jupyter notebooks.

The toy datasets "toy_hf" and "toy_modulated" are generated by calling load_dataset (with id='toy_hf' or id='toy_modulated', respectively) in the model_training_and_evaluation.ipynb notebook, e.g. as sketched below.
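A minimal usage sketch; load_dataset is defined in the notebook, and everything beyond the id argument (including what it returns) is an assumption:

# Hypothetical usage of the notebook's load_dataset helper.
data = load_dataset(id='toy_hf')         # generates the high-frequency toy dataset
data = load_dataset(id='toy_modulated')  # generates the modulated toy dataset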

Quick Start

Model Training

The code for the experiments on the UCI and toy datasets is provided in the Jupyter notebook model_training_and_evaluation.ipynb. Make sure the conda environment is activated, then execute all cells except the last one.

In the last cell, set the path in line 24 (exp_dir = ...). The path points to a directory exp_dir in which experiment log files are stored; if the directory does not exist, it will be created.

To run the training/evaluation for the hyperparameter study (beta parameter), replace the line

methods = available_methods  

with

methods = {'mc_mod_sml1', 'mc_mod_sml25', 'mc_mod_sml', 'mc_mod_sml75', 'mc_mod_sml90'}

This starts the training with the SML loss for different values of beta.

Run the last cell to start the experiment. After training finishes, exp_dir contains subdirectories for each dataset. These subdirectories hold zipped dictionaries with the performance/uncertainty metrics on train/test data; see the next section.
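The exact file layout and serialization are defined in the notebook; purely for illustration, assuming gzip-compressed pickle files (the path below is hypothetical), such a dictionary could be inspected like this:

import gzip
import pickle

# Hypothetical path; the actual file names are produced by the notebook.
with gzip.open('exp_dir/some_dataset/results.pkl.gz', 'rb') as f:
    metrics = pickle.load(f)
print(sorted(metrics.keys()))  # the stored performance/uncertainty metrics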

Model Evaluation

To visualize evaluation results, another notebook is provided: model_evaluation_and_plots.ipynb.

  • First run cells 1-4.

  • To create the plots in Figure 12,

    • first run cells 5, 6.
    • Set the path in the exp_dirs list. It should point to the exp_dir used when running the training/evaluation script.
    • Finally, run cell 7. The plots will be stored in the ./plots directory.
  • To create the plots in Figure 2,

    • first run cells 8, 9.
    • In cell 10, change the exp_dir path accordingly. It should point to an exp_dir that contains a subdirectory for the "toy_modulated" dataset.
    • Run cell 10 to create the plots for "toy_modulated". They will be stored in ./plots/toy_modulated.pdf.
    • In cell 11, change the exp_dir path accordingly. It should point to an exp_dir that contains a subdirectory for the "toy_hf" dataset.
    • Run cell 11 to create the plot for "toy_hf". It will be stored in ./plots/toy_hf.pdf.
  • To create plots similar to Figures 3, 8, 9, 10,

    • first run cells 12, 13, 14.

    • In cell 15, change the exp_dirs list accordingly. It should contain the paths to your exp_dir directories.

    • Run cell 15 afterwards.

    • Run cell 18 to create a plot showing RMSE values for all datasets. Run cell 19 to include "SML_DE" in the plot.

    • Run cell 21 to create a plot showing NLL values for all datasets. Run cell 22 to include "SML_DE" in the plot.

    • Run cell 24 to create a plot showing ECE values for all datasets. Run cell 25 to include "SML_DE" in the plot.

    • Run cell 27 to create a plot showing Wasserstein distance values for all datasets. Run cell 28 to include "SML_DE" in the plot. All plots will be stored in the ./plots directory.

  • To create the plots similar to Figures 13, 14,

    • change the exp_dirs list of cell 29 accordingly. It should contain paths to the exp_dir of a hyperparameter training/evaluation run as described above.
  • Then run cells 29-33 to create the plots for the hyperparameter study with regard to RMSE, NLL, ECE and Wasserstein distance. The plots will be stored in the ./plots directory.

  • To create the plot in Figure 15,

    • first run cell 34.
    • Again, change the exp_dirs list of cell 35 accordingly. It should contain paths to exp_dir of a standard training/evaluation run.
    • Then run cells 35, 38, 39, 40 to create the plots. They will be stored in the ./plots directory.

Due to random train-test splits and random initializations, the obtained results might slightly deviate from those reported in the paper.

Used Hardware & Runtimes

All experiments were conducted on an Intel(R) Xeon(R) Gold 6126 CPU @ 2.60 GHz. For the object detection experiments, we additionally used a Tesla V100 GPU.

Conducting the described experiments with cross-validation takes about 6 hours (on CPU) for the toy data, 80 hours (on CPU) for the UCI regression tasks, and 8 hours (on GPU) for the object detection experiments.

License

The SML code is released under the MIT license.

Citing the Second-Moment Loss

If you use or reference the second-moment loss in your research, please use the following BibTeX entry.

@article{sml,
  author = {Joachim Sicking and Maram Akila and Maximilian Pintz and Tim Wirtz and Asja Fischer},
  title  = {Second-Moment Loss: A Novel Regression Objective for Improved Uncertainties},
  year   = {2020}
}
