Feat(EstimatorReport): Display the feature weights for linear models #1320
Comments
There is one method for both use-cases, I guess that's a typo? Currently you write |
Thanks for catching the typo; it was everywhere and it's fixed now. |
By "linear models" (https://scikit-learn.org/stable/modules/linear_model.html), which models do you mean exactly? FYI, for a very first iteration, we dealt with LinearRegression, Ridge, and Lasso. |
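For reference, a minimal sketch of what "feature weights" could look like for the three estimators named above, using only scikit-learn and pandas. The helper name `feature_weights`, the `Coefficient` column name, and the toy data are assumptions made for illustration; this is not the report's actual API.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Assumption: only the estimators from the first iteration are supported.
SUPPORTED_LINEAR_MODELS = (LinearRegression, Ridge, Lasso)


def feature_weights(model, feature_names):
    """Hypothetical helper: intercept and coefficients of a fitted model as a table."""
    if not isinstance(model, SUPPORTED_LINEAR_MODELS):
        raise TypeError(f"{type(model).__name__} is not a supported linear model")
    # Flatten so a scalar intercept and a 1-D coef_ can be concatenated
    # (single-target case only).
    intercept = np.ravel(model.intercept_)
    coef = np.ravel(model.coef_)
    values = np.concatenate([intercept, coef])
    index = ["Intercept", *feature_names]
    return pd.DataFrame({"Coefficient": values}, index=index)


# Toy, noiseless data: y = 1 + 3 * x0 - 2 * x1
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0

weights = feature_weights(LinearRegression().fit(X, y), ["x0", "x1"])
print(weights)
```

Swapping `LinearRegression()` for `Ridge()` or `Lasso()` works the same way, since all three expose `coef_` and `intercept_` after fitting.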
So cool 🤩 |
We need to add a check to see whether the data is normalized; otherwise it makes no sense to interpret the coefficients. If the input estimator is a pipeline and the pipeline contains a scaler, then we are good. @glemaitre, do you know whether scikit-learn has an internal utility to check that the input data is normalized? |
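One possible way to do the pipeline part of that check, as a rough sketch: look for a known scaler among the steps that precede the final estimator. The helper name `pipeline_has_scaler` and the list of transformers treated as scalers are assumptions; it says nothing about data that was scaled outside the pipeline.

```python
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    RobustScaler,
    StandardScaler,
)

# Assumption: these transformers count as "a scaler"; the list is illustrative.
SCALER_TYPES = (StandardScaler, MinMaxScaler, MaxAbsScaler, RobustScaler)


def pipeline_has_scaler(estimator):
    """Return True if `estimator` is a Pipeline with a scaler before its final step."""
    if not isinstance(estimator, Pipeline):
        return False
    # Pipeline.steps is a list of (name, estimator) tuples; skip the last one,
    # which is the model itself.
    return any(isinstance(step, SCALER_TYPES) for _, step in estimator.steps[:-1])


scaled_model = make_pipeline(StandardScaler(), Ridge())
raw_model = Ridge()

print(pipeline_has_scaler(scaled_model))
print(pipeline_has_scaler(raw_model))
```

A `False` result could then trigger a warning rather than an error, which is the alternative raised in the discussion.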
It would also be nice to have this kind of pandas styling: Related to #1351. WDYT? |
Do we? Or should this be a warning when we add them?
With blue and orange, but yes, if it's not much in terms of code, and it doesn't change the nature of the dataframe (I wouldn't use format though, it's something people should be able to choose by themselves) |
IMO strong yes, we can't have a
I like red and green because the underlying data is "ordinal" / "ordered": green conveys positive and red conveys negative.
Using pandas.style does change the nature: it is no longer a pandas DataFrame but a pandas Styler, hence the API suggested in #1351.
In the styler, formatting the number of decimals makes sense IMO: too many decimals have no statistical significance, and the last decimals of 2.07464663737764 are just noise. scikit-learn's classification report also trims the extra decimals. |
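To make the styling proposal concrete, here is a small sketch with pandas only. The names `sign_color` and `style_weights`, the `Coefficient` column, and every value other than 2.07464663737764 are made up for illustration; the colors are parameters, so blue and orange work just as well as green and red.

```python
import pandas as pd


def sign_color(value, positive="green", negative="red"):
    """CSS for one cell: one color for positive weights, another for negative."""
    if value > 0:
        return f"color: {positive}"
    if value < 0:
        return f"color: {negative}"
    return ""


def style_weights(weights, column="Coefficient", decimals=3):
    """Return a pandas Styler (not a DataFrame) with colored, rounded weights."""
    return (
        weights.style
        # Styler.apply calls the function once per column and expects one
        # CSS string per cell.
        .apply(lambda col: [sign_color(v) for v in col], subset=[column])
        # Only the display is rounded; the underlying data is unchanged.
        .format(f"{{:.{decimals}f}}", subset=[column])
    )


# Made-up weights for illustration.
weights = pd.DataFrame(
    {"Coefficient": [2.07464663737764, -0.5, 0.0]},
    index=["x0", "x1", "x2"],
)
```

Calling `style_weights(weights)` in a notebook renders the colored table. Because the result is a Styler, the original DataFrame stays available for anyone who wants the raw numbers or their own formatting.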
About the warnings: OK, then for now, since we don't have the warnings implemented and it will be done in the next iteration, let's not add this.
Styling:
|
Many thanks, agreed |
Is your feature request related to a problem? Please describe.
As a Data Scientist, to explain my model and understand the problem I'm trying to solve, I need to check the feature importance by looking at the model's weights. This should be available for linear models only.
Describe the solution you'd like
Describe alternatives you've considered, if relevant
Later, if the report object ends up with too many accessors, we will group the feature importances and add a parameter to choose which type of feature importance to display.
Additional context
part of epic #1314