Twice differentiable LASSO #36

Open
pzivich opened this issue Feb 8, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request Estimating-Equation Request for new estimating equation

Comments

@pzivich
Owner

pzivich commented Feb 8, 2024

Is your feature request related to a problem? Please describe.

The current LASSO is technically invalid with the sandwich variance estimator, because the penalized score is not differentiable at zero. There are alternative penalties that mimic LASSO but whose score is differentiable everywhere. These might be better to include as the default option (the existing LASSO would be maintained, but it would be good to have a valid version as the default).
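
To make the problem concrete, here is a minimal sketch in plain Python. The helper names (`lasso_gradient`, `central_difference`) are only for illustration and are not part of delicatessen. The L1 penalty contributes a term proportional to sign(beta) to the score, and a finite-difference slope of that term at zero grows without bound as the step shrinks, so the derivative needed for the bread of the sandwich does not exist there.

```python
def lasso_gradient(beta, lam=1.0):
    """Contribution of the L1 penalty lam*|beta| to the score (sub-gradient 0 at beta=0)."""
    if beta > 0:
        return lam
    if beta < 0:
        return -lam
    return 0.0


def central_difference(f, x, h):
    """Central finite-difference approximation of f'(x) with step h."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Away from zero the penalty gradient is flat, so its derivative is zero
slope_away = central_difference(lasso_gradient, 1.0, 1e-4)

# At zero the approximation equals lam / h, which diverges as h shrinks
slopes_at_zero = [central_difference(lasso_gradient, 0.0, h)
                  for h in (1e-1, 1e-2, 1e-3)]

print(slope_away)
print(slopes_at_zero)
```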

Describe the solution you'd like

Various penalties could be added besides the bridge penalty. For LASSO specifically, the dLASSO is one option:

https://arxiv.org/pdf/1609.04985.pdf

The only issue is that this penalty is biased. A SCAD-dLASSO hybrid would be ideal, as it would be both twice differentiable and unbiased. I haven't seen this proposed, so I would either need to find such a paper or put out a methods paper on it first.

Describe alternatives you've considered

Leave as-is. Don't add any more penalized regression.

Additional context

None

@pzivich pzivich added the enhancement New feature or request label Feb 8, 2024
@pzivich
Owner Author

pzivich commented Feb 8, 2024

More on SCAD and what the penalty looks like

https://andrewcharlesjones.github.io/journal/scad.html
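
For reference, a rough sketch of the gradient of the SCAD penalty, written from its usual piecewise definition. The function name is hypothetical and `a=3.7` is the conventional default, not something chosen in this issue. It shows why SCAD avoids the bias for large coefficients (the gradient tapers to zero), but also that it still has the sign jump at zero and kinks at `lam` and `a*lam`, so on its own it does not fix the differentiability problem.

```python
def scad_gradient(beta, lam=1.0, a=3.7):
    """Derivative of the SCAD penalty with respect to beta (taken as 0 at beta=0)."""
    abs_b = abs(beta)
    sign = (beta > 0) - (beta < 0)
    if abs_b <= lam:
        magnitude = lam                            # behaves like LASSO near zero
    elif abs_b <= a * lam:
        magnitude = (a * lam - abs_b) / (a - 1)    # shrinkage tapers off linearly
    else:
        magnitude = 0.0                            # large coefficients are not shrunk
    return sign * magnitude


for b in (-0.5, 0.0, 0.5, 2.0, 5.0):
    print(b, scad_gradient(b))
```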

@pzivich
Owner Author

pzivich commented Mar 4, 2024

Below is an experimental example of dLASSO:

import numpy as np
import pandas as pd
from delicatessen import MEstimator
from delicatessen.estimating_equations import ee_ridge_regression
from delicatessen.utilities import standard_normal_cdf, standard_normal_pdf

np.random.seed(8509353)
n = 50
d = pd.DataFrame()
d['X1'] = np.random.normal(size=n)
d['X2'] = np.random.normal(size=n)
d['X3'] = np.random.normal(size=n)
d['X4'] = np.random.normal(size=n)
d['X5'] = np.random.normal(size=n)
d['X6'] = np.random.normal(size=n)
d['X7'] = np.random.normal(size=n)
d['X8'] = np.random.normal(size=n)
d['X9'] = np.random.normal(size=n)
d['Y'] = 5 + d['X1'] - 0.5*d['X2'] + np.random.normal(size=n)
d['I'] = 1
x_cols = ['I', 'X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9']


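def psi(theta):
    # Column vectors so the residuals broadcast against the columns of X
    beta = np.asarray(theta)[:, None]
    y = np.asarray(d['Y'])[:, None]
    X = np.asarray(d[x_cols])

    pred_y = np.dot(X, beta)
    s = 1e-5                                      # smoothing constant
    penalty = np.asarray([0., ] + [5., ]*9) / n   # intercept is not penalized; scaled by n (= 50)
    # Smooth stand-in for sign(beta) in the LASSO score
    penalty_terms = penalty[:, None] * (2*standard_normal_cdf(beta/s) + 2*(beta/s)*standard_normal_pdf(beta/s) - 1)

    # Least-squares score minus the penalty gradient, one row per parameter
    ee_reg = (((y - pred_y) * X).T - penalty_terms)
    return ee_reg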


estr = MEstimator(psi, init=[0., 1., -0.5, ] + [0., ]*7)
estr.estimate(deriv_method='exact')


# Ridge regression fit with the same penalty weights
def psi(theta):
    return ee_ridge_regression(theta=theta, y=d['Y'], X=d[x_cols], model='linear', penalty=[0, ] + [5., ]*9)


estr = MEstimator(psi, init=[0., ]*len(x_cols))
estr.estimate(deriv_method='exact')

I still need to understand the operating characteristics better.
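
As a first look, the smoothed penalty term from `psi` above can be examined on its own. This is a plain-Python sketch: the helper names are hypothetical, and the normal CDF and PDF are re-implemented with `math` rather than taken from `delicatessen.utilities`. The term is exactly zero at beta = 0 and approaches sign(beta) once |beta| is large relative to `s`, so a small `s` such as 1e-5 behaves like LASSO almost everywhere, while a larger `s` smooths over a wider region around zero.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def smooth_penalty_gradient(beta, s):
    """Same expression as the penalty term in psi, for a single coefficient."""
    z = beta / s
    return 2.0 * std_normal_cdf(z) + 2.0 * z * std_normal_pdf(z) - 1.0


# Evaluate the term on a small grid for several smoothing constants
for s in (1.0, 0.1, 1e-5):
    values = [smooth_penalty_gradient(b, s) for b in (-0.5, -0.01, 0.0, 0.01, 0.5)]
    print(s, [round(v, 4) for v in values])
```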

There is also a dSCAD that can be implemented.

@pzivich
Owner Author

pzivich commented Mar 29, 2024

Okay, I was not too excited about the dLASSO, but I am coming around a bit. I think it solves some of the issues of the standard LASSO in this context. My continuing hesitation is that there is only a single paper on this, and I'm not sure how well vetted the theory is.

It is probably fine, but I am going to have to study it more myself...

@pzivich pzivich added the Estimating-Equation Request for new estimating equation label Apr 23, 2024