Generalized Linear Models (GLM) Benchmark

This repository benchmarks solvers for generalized linear models (GLMs) using the Benchopt framework. You can learn more about Benchopt at https://benchopt.github.io.

Theoretical Overview

A generalized linear model (GLM) is a flexible generalization of ordinary linear regression: it allows the linear model to be related to the response variable via a link function. The outcome $\boldsymbol{Y}$ (dependent variable) is assumed to be generated from a distribution in the exponential family (e.g. Normal, Binomial, Poisson, Gamma). The mean $\boldsymbol{\mu}$ of the distribution depends on the independent variables $\boldsymbol{X}$ through the relation:

$$\mathbb{E}[\boldsymbol{Y}|\boldsymbol{X}] = \boldsymbol{\mu} = g^{-1}(\boldsymbol{X}\hspace{1pt}\boldsymbol{\beta})$$

where $\mathbb{E}[\boldsymbol{Y}|\boldsymbol{X}]$ is the expected value of $\boldsymbol{Y}$ conditional on $\boldsymbol{X}$, $\boldsymbol{X}\hspace{1pt}\boldsymbol{\beta}$ is the linear predictor and $g(\cdot)$ is the link function.
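
To make the role of the link function concrete, here is a minimal NumPy sketch that maps a linear predictor to the mean $\boldsymbol{\mu}$ for the three models covered in this README. The names `INVERSE_LINKS` and `glm_mean` are hypothetical and are not part of this repository or of Benchopt.

```python
import numpy as np

# Inverse link functions g^{-1} for the three models in this README.
# (Illustrative names only; not taken from the benchmark code.)
INVERSE_LINKS = {
    "linear": lambda eta: eta,                           # identity link
    "logistic": lambda eta: 1.0 / (1.0 + np.exp(-eta)),  # logit link
    "poisson": np.exp,                                   # log link
}


def glm_mean(X, beta, model):
    """Return mu = g^{-1}(X @ beta) for the given model name."""
    eta = X @ beta  # linear predictor
    return INVERSE_LINKS[model](eta)


X = np.array([[1.0, 0.0], [1.0, 2.0]])
beta = np.array([0.0, 0.5])
for model in INVERSE_LINKS:
    print(model, glm_mean(X, beta, model))
```

The same linear predictor yields an unbounded mean, a probability in $(0, 1)$, or a positive rate, depending only on the choice of $g$.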

Practical Examples

As already mentioned, let $Y$ be the outcome (dependent variable) and $\boldsymbol{X}$ be the independent variables. The three types of regression analyzed here (Linear, Logistic and Poisson) differ in the nature of $Y$. For each type, dedicated datasets and solvers are provided.


Linear Regression

In the case of linear regression, $Y$ is modeled as:

$$\begin{cases} \hspace{4pt} Y\sim N(\mu,\sigma^2)\\ \hspace{4pt} \mu = \boldsymbol{X}\hspace{1pt}\boldsymbol{\beta} \end{cases}$$
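
Under this Gaussian model, maximizing the likelihood in $\boldsymbol{\beta}$ is equivalent to minimizing the sum of squared residuals. The sketch below illustrates this on synthetic data with NumPy. The data, the function name `least_squares_loss` and the $1/(2n)$ scaling are assumptions made for illustration; the objective actually used by the benchmark may differ (for example in scaling or regularization).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 200, 3

# Synthetic data following Y ~ N(X @ beta_true, sigma^2) with sigma = 0.1.
X = rng.standard_normal((n_samples, n_features))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.standard_normal(n_samples)


def least_squares_loss(beta, X, y):
    """Half mean squared residual (assumed scaling, for illustration)."""
    residual = y - X @ beta
    return 0.5 * residual @ residual / len(y)


# Closed-form least-squares solution.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimate:", beta_hat)
print("loss at estimate:", least_squares_loss(beta_hat, X, y))
```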

The following datasets are used:


Logistic Regression

In the case of logistic regression, $Y$ is a binary categorical value (**make sure the labels are encoded as $-1$ or $1$**) and it is modeled as:

$$\begin{cases} \hspace{4pt} Y \sim Bernoulli(\mu)\\ \hspace{4pt} \log(\frac{\mu}{1-\mu}) = \boldsymbol{X}\hspace{1pt}\boldsymbol{\beta} \end{cases}$$
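
The Bernoulli model is usually written with labels in 0/1, while the logistic loss is often written with labels in $-1$/$1$. The following sketch checks numerically that the two forms agree once the labels are recoded. The function names and the synthetic data are hypothetical, and the loss used by the benchmark may be scaled differently.

```python
import numpy as np


def logistic_loss(beta, X, y):
    """Mean logistic loss log(1 + exp(-y * X @ beta)) for y in {-1, 1}."""
    margins = y * (X @ beta)
    # logaddexp(0, -m) computes log(1 + exp(-m)) in a numerically stable way.
    return np.mean(np.logaddexp(0.0, -margins))


def bernoulli_nll(beta, X, y01):
    """Mean Bernoulli negative log-likelihood for y01 in {0, 1}."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))  # inverse logit link
    return -np.mean(y01 * np.log(mu) + (1 - y01) * np.log(1 - mu))


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
beta = rng.standard_normal(3)
y01 = rng.integers(0, 2, size=50)  # labels in {0, 1}
y_pm = 2 * y01 - 1                 # same labels recoded to {-1, 1}

print(logistic_loss(beta, X, y_pm), bernoulli_nll(beta, X, y01))
```

The recoding `2 * y01 - 1` is one simple way to convert a 0/1 dataset to the $-1$/$1$ convention.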

The following datasets are used:


Poisson Regression

In the case of Poisson regression, $Y$ is a count value and it is modeled as:

$$\begin{cases} \hspace{4pt} Y \sim Poisson(\mu)\\ \hspace{4pt}\log(\mu) = \boldsymbol{X}\hspace{1pt}\boldsymbol{\beta} \end{cases}$$
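
With the log link, the Poisson negative log-likelihood is, up to the term $\log(Y!)$ that does not depend on $\boldsymbol{\beta}$, the mean of $\exp(\boldsymbol{X}\boldsymbol{\beta}) - Y\,\boldsymbol{X}\boldsymbol{\beta}$. Below is a minimal sketch of this loss and its gradient on synthetic counts. The function names, the data and the $1/n$ scaling are assumptions for illustration, not the benchmark's own definitions.

```python
import numpy as np


def poisson_loss(beta, X, y):
    """Mean Poisson negative log-likelihood, dropping the log(y!) constant."""
    eta = X @ beta  # linear predictor, log(mu)
    return np.mean(np.exp(eta) - y * eta)


def poisson_grad(beta, X, y):
    """Gradient of poisson_loss with respect to beta."""
    eta = X @ beta
    return X.T @ (np.exp(eta) - y) / len(y)


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
beta_true = np.array([0.3, -0.2])
y = rng.poisson(np.exp(X @ beta_true))  # synthetic count data

print("loss:", poisson_loss(beta_true, X, y))
print("gradient:", poisson_grad(beta_true, X, y))
```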

The following datasets are used:


How to use this benchmark

This benchmark can be run using the following commands:


   $ pip install -U benchopt
   $ git clone https://github.com/wassimmazouz/benchmark_glm
   $ cd benchmark_glm
   $ benchopt run .

Options can be passed to benchopt run to restrict the benchmark to some solvers or datasets, e.g.:


	$ benchopt run . -s sklearn -d bcancer --max-runs 10 --n-repetitions 10

Use benchopt run -h for more details about these options, or visit https://benchopt.github.io/api.html.