Compatibility with Scikit-Learn Feature Selection #35

Open
sanjaradylov opened this issue May 10, 2022 · 0 comments

BaseGroupLasso is implemented as a scikit-learn transformer, and it works well as an intermediate step in a pipeline. Wrapping it in SelectFromModel (to further remove features with near-zero coefficients), however, raises a ValueError when the pipeline is fitted:

>>> import sklearn, group_lasso
>>> sklearn.__version__, group_lasso.__version__
('1.0.2', '1.5.0')
>>> from group_lasso import GroupLasso
>>> from sklearn.datasets import make_regression
>>> from sklearn.feature_selection import SelectFromModel
>>> from sklearn.linear_model import Ridge
>>> from sklearn.pipeline import make_pipeline
>>> X, y = make_regression(n_features=5, n_informative=3, random_state=0)
>>> pipe = make_pipeline(SelectFromModel(GroupLasso(supress_warning=True)), Ridge())
>>> pipe.fit(X, y)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-e63339588fc2> in <module>()
      3 
      4 X, y = make_regression(n_features=5, n_informative=3, random_state=0)
----> 5 pipe.fit(X, y)

6 frames
/usr/local/lib/python3.7/dist-packages/sklearn/feature_selection/_base.py in _transform(self, X)
    101             return np.empty(0).reshape((X.shape[0], 0))
    102         if len(mask) != X.shape[1]:
--> 103             raise ValueError("X has a different shape than during fitting.")
    104         return X[:, safe_mask(X, mask)]
    105 

ValueError: X has a different shape than during fitting.
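
For readers unfamiliar with the check at the bottom of the traceback, here is a minimal NumPy-only sketch of what it does: the selector's boolean support mask must have one entry per column of X. The helper name select_columns and both masks are made up for illustration. The length-1 mask only demonstrates how the error is triggered and is not a claim about what GroupLasso actually produces.

```python
import numpy as np


def select_columns(X, mask):
    """Hypothetical stand-in for the shape check shown in the traceback."""
    mask = np.asarray(mask, dtype=bool)
    # The mask needs exactly one entry per feature (column) of X.
    if len(mask) != X.shape[1]:
        raise ValueError("X has a different shape than during fitting.")
    return X[:, mask]


X = np.zeros((100, 5))

# A mask with one entry per feature selects columns as expected.
good_mask = np.array([True, False, True, True, False])
print(select_columns(X, good_mask).shape)  # (100, 3)

# An illustrative mask of the wrong length reproduces the error message.
bad_mask = np.array([True])
try:
    select_columns(X, bad_mask)
except ValueError as err:
    print(err)
```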

Possible solutions:

  1. Add a new parameter min_coef and zero out every coefficient for which np.abs(coef_) < min_coef. Optionally, replace the TransformerMixin methods by inheriting from SelectorMixin, which would also provide the get_support() and get_feature_names_out() methods.
    or
  2. Remove the fit_transform and transform methods so that SelectFromModel(GroupLasso(), threshold=min_coef) can be used inside a pipeline.
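
The thresholding step in the first option could look roughly like the sketch below. It uses only NumPy and is not group_lasso code: the function name threshold_coefficients, the example coefficients, and the assumption that coef_ is laid out as (n_features, n_targets) are all illustrative.

```python
import numpy as np


def threshold_coefficients(coef, min_coef):
    """Zero out entries with |coef| < min_coef and derive a feature mask.

    Assumes `coef` has one row per feature, i.e. shape (n_features,)
    or (n_features, n_targets). This layout is an assumption.
    """
    coef = np.asarray(coef, dtype=float)
    coef = coef.reshape(coef.shape[0], -1)
    # Element-wise thresholding, as proposed in option 1.
    cleaned = np.where(np.abs(coef) < min_coef, 0.0, coef)
    # A feature is kept if any of its coefficients survived.
    support = np.any(cleaned != 0.0, axis=1)
    return cleaned, support


# Made-up coefficients: two clearly non-zero, three negligible or zero.
coef = np.array([[1.5], [1e-12], [-0.7], [0.0], [3e-9]])
cleaned, support = threshold_coefficients(coef, min_coef=1e-8)

X = np.arange(20, dtype=float).reshape(4, 5)
X_selected = X[:, support]  # keeps columns 0 and 2
print(support, X_selected.shape)
```

A get_support() method of the kind SelectorMixin expects could then return such a mask, which is what would make get_feature_names_out() available as well.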