Add GLM Support #763

Merged: 53 commits into master from glm on Jan 4, 2025
Conversation

@s3alfisc (Member) commented Dec 28, 2024:

Introduces pf.feglm(), which allows fitting GLMs with Gaussian and Binomial families (logit and probit links).

In Python:

%load_ext autoreload
%autoreload 2

import pyfixest as pf
import numpy as np

data = pf.get_data()
data["Y"] = np.where(data["Y"] > 0, 1, 0)

fit_ols = pf.feols("Y ~ X1", data=data)
fit_gaussian = pf.feglm("Y ~ X1", data=data, family="gaussian")
fit_probit = pf.feglm("Y ~ X1", data=data, family="probit")
fit_logit = pf.feglm("Y ~ X1", data=data, family="logit")

pf.etable([fit_ols, fit_gaussian, fit_probit, fit_logit], digits=6)

[Screenshot: pf.etable output comparing the four fits]

In R:

library(reticulate)
library(fixest)

data = py$data

fit_ols = feols(Y ~ X1, data = data)
fit_gaussian = feglm(Y ~ X1, data = data, family = "gaussian")
fit_probit = feglm(Y ~ X1, data = data, family = binomial(link = "probit"))
fit_logit = feglm(Y ~ X1, data = data, family = binomial(link = "logit"))

etable(fit_ols, fit_gaussian, fit_probit, fit_logit)

[Screenshot: fixest etable output comparing the four fits]

To do:

  • point estimates match
  • inference matches
  • prediction matches

Only apparent issues / errors:

  • IWLS sometimes fails, even with step halving. Due to separation? It seems to affect mostly probit, so likely an implementation error. -> Fixed: the deviance was implemented incorrectly (see the sketch after this list).
  • Scores for the Gaussian family match OLS scores in pyfixest but not in fixest (Gaussian scores in fixest differ from OLS scores; not sure why).
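
For reference, a minimal sketch of the binomial deviance that the fix converges to: this is the standard textbook formula, not the actual pyfixest code, and the function name is hypothetical.

import numpy as np
from scipy.special import xlogy

def binomial_deviance(y: np.ndarray, mu: np.ndarray) -> float:
    # xlogy(a, b) = a * log(b), defined as 0 where a == 0, so the
    # y == 0 and y == 1 cases need no special handling
    dev_terms = xlogy(y, y / mu) + xlogy(1.0 - y, (1.0 - y) / (1.0 - mu))
    return 2.0 * float(np.sum(dev_terms))

A wrong deviance matters beyond reporting: both the convergence check and step halving compare successive deviances, so an incorrect formula can make IWLS appear to diverge.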

@s3alfisc s3alfisc marked this pull request as draft December 28, 2024 11:40
@s3alfisc s3alfisc mentioned this pull request Dec 28, 2024
@juanitorduz (Contributor) left a comment:

Here are some minor code-style comments. I am unfamiliar with the numerical estimation details, so someone else can help review them.

weights_type=weights_type,
# tol=tol,
# maxiter=maxiter,
collin_tol=collin_tol,
@juanitorduz (Contributor):

Why these commented-out lines? :)

@s3alfisc (Member, Author):

It's a relic from an earlier implementation in which the Feglm class inherited from Fepois, which has these arguments. Now it inherits from Feols, which does not have them, so both arguments are set in Feglm.__init__. I've cleaned this up; the fix comes with the next commit.
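
A minimal sketch of the structure described above, under the assumption that tol and maxiter are the GLM-specific arguments in question; the bodies are hypothetical, not the actual pyfixest internals.

class Feols:
    def __init__(self, collin_tol: float, weights_type: str):
        self.collin_tol = collin_tol
        self.weights_type = weights_type

class Feglm(Feols):
    def __init__(self, collin_tol: float, weights_type: str,
                 tol: float = 1e-08, maxiter: int = 25):
        super().__init__(collin_tol=collin_tol, weights_type=weights_type)
        # tol and maxiter are IWLS-specific, so they live here rather than in Feols
        self.tol = tol
        self.maxiter = maxiter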


# Step 1: _get weights w_tilde(r-1) and v(r-1) (eq. 2.5)
detadmu = self._update_detadmu(mu=mu)
# v = self._update_v(y=_Y.flatten(), mu=mu, detadmu=detadmu)
@juanitorduz (Contributor):

Another commented-out line?
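
For context, the quantities the quoted snippet computes are the standard IWLS ingredients referenced in its "eq. 2.5" comment: working weights w = 1 / (V(mu) * (d eta / d mu)^2) and a working response z = eta + (y - mu) * d eta / d mu. A generic probit sketch under those textbook definitions, not the pyfixest implementation:

import numpy as np
from scipy.stats import norm

def iwls_quantities_probit(y: np.ndarray, eta: np.ndarray):
    # probit link: mu = Phi(eta), so d eta / d mu = 1 / phi(eta)
    mu = norm.cdf(eta)
    detadmu = 1.0 / norm.pdf(eta)
    v = mu * (1.0 - mu)               # binomial variance function V(mu)
    w = 1.0 / (v * detadmu**2)        # working weights
    z = eta + (y - mu) * detadmu      # working response
    return w, z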

v_dotdot=v_dotdot,
X_dotdot=X_dotdot,
deviance_old=deviance_old,
step_halfing_tolerance=1e-12,
@juanitorduz (Contributor):

Do you see a case where the user would need to modify the step_halfing_tolerance parameter?

@s3alfisc (Member, Author):

I checked the glmnet API, and it does not seem to support setting the step-halving tolerance: link. For now I would not expose it, but add it in case users request it?
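
For readers unfamiliar with the mechanism: step halving shrinks an IWLS update that overshoots, halving the step until the deviance improves or the step size drops below the tolerance. A generic sketch; compute_deviance is a hypothetical callable, not a pyfixest function.

import numpy as np

def halve_step(beta_old: np.ndarray, beta_new: np.ndarray,
               deviance_old: float, compute_deviance, tol: float = 1e-12):
    # keep halving the step until the deviance no longer increases
    step = 1.0
    beta = beta_new
    while compute_deviance(beta) > deviance_old and step > tol:
        step /= 2.0
        beta = beta_old + step * (beta_new - beta_old)
    return beta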

return Z.T @ W @ v

def _get_diff(self, deviance: np.ndarray, last: np.ndarray) -> np.ndarray:
return np.abs(deviance - last) / (0.1 + np.abs(last))
@juanitorduz (Contributor):

Why 0.1? It clearly seems to be for numerical stability, but could it be smaller (just curious)?

@s3alfisc (Member, Author):

I actually don't know. I am using it to mimic fixest, which itself follows R's glm implementation, which uses this convergence criterion: link
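
Concretely, the criterion declares convergence once the scaled change in deviance is small; the 0.1 keeps the denominator away from zero when the deviance itself is near zero. A sketch, with eps defaulting to 1e-8 as in R's glm.control:

import numpy as np

def converged(deviance: float, deviance_old: float, eps: float = 1e-08) -> bool:
    # R glm-style stopping rule: scaled deviance change below eps
    return bool(np.abs(deviance - deviance_old) / (0.1 + np.abs(deviance_old)) < eps)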

@s3alfisc s3alfisc marked this pull request as ready for review January 1, 2025 13:19
@juanitorduz (Contributor):

I had some issues with the lock file in #767 (initial commits). I had to make sure to update only the environments I really needed; it seems a full update breaks something (in case it helps).

@s3alfisc (Member, Author) commented Jan 4, 2025:

Tests fail due to an incorrect assumption about formulaic's behavior, which seems to have changed with version 1.1.0. Setting the formulaic requirement to <1.1.0 for now. Details here: #777

@s3alfisc merged commit 83a6f27 into master on Jan 4, 2025. 9 checks passed.
@s3alfisc deleted the glm branch on January 4, 2025 at 14:56.
Merging this pull request closes the Logistic Regression issue.