Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC] Fitting MultiRocketHydraClassifier with class_weight="balanced_subsample" gives an error #2434

Closed
dschrempf opened this issue Dec 9, 2024 · 9 comments
Labels
classification Classification package documentation Improvements or additions to documentation

Comments

@dschrempf
Copy link
Contributor

Describe the bug

Fitting MultiRocketHydraClassifier with class_weight="balanced_subsample" gives an error.

Steps/Code to reproduce the bug

from aeon.classification.convolution_based import MultiRocketHydraClassifier
from aeon.testing.data_generation import make_example_3d_numpy

X, y = make_example_3d_numpy(n_cases=10, n_channels=1, n_timepoints=12, random_state=0)
clf = MultiRocketHydraClassifier(random_state=0, class_weight="balanced_subsample")
clf.fit(X, y)

Expected results

Classifier is fitted.

Actual results

Traceback (most recent call last):
  File "/home/mext/mext-dom-aeon/spikes/eeg/test.py", line 6, in <module>
    clf.fit(X, y)
  File "/home/mext/mext-dom-aeon/.venv/lib/python3.9/site-packages/aeon/classification/base.py", line 114, in fit
    self._fit(X, y)
  File "/home/mext/mext-dom-aeon/.venv/lib/python3.9/site-packages/aeon/classification/convolution_based/_mr_hydra.py", line 127, in _fit
    self.classifier.fit(Xt, y)
  File "/home/mext/mext-dom-aeon/.venv/lib/python3.9/site-packages/sklearn/base.py", line 1466, in wrapper
    estimator._validate_params()
  File "/home/mext/mext-dom-aeon/.venv/lib/python3.9/site-packages/sklearn/base.py", line 666, in _validate_params
    validate_parameter_constraints(
  File "/home/mext/mext-dom-aeon/.venv/lib/python3.9/site-packages/sklearn/utils/_param_validation.py", line 95, in validate_parameter_constraints
    raise InvalidParameterError(
sklearn.utils._param_validation.InvalidParameterError: The 'class_weight' parameter of RidgeClassifierCV must be an instance of 'dict', a str among {'balanced'} or None. Got 'balanced_subsample' instead.

Versions

System: python: 3.9.20 (main, Oct 16 2024, 04:36:33) [Clang 18.1.8 ] executable: /home/.../.venv/bin/python3 machine: Linux-6.8.0-49-generic-x86_64-with-glibc2.39 Python dependencies: aeon: 1.0.0 pip: None setuptools: 75.6.0 scikit-learn: 1.5.2 numpy: 1.26.4 numba: 0.60.0 scipy: 1.13.1 pandas: 2.2.3
@dschrempf dschrempf added the bug Something isn't working label Dec 9, 2024
@MatthewMiddlehurst MatthewMiddlehurst added the classification Classification package label Dec 11, 2024
@MatthewMiddlehurst
Copy link
Member

MatthewMiddlehurst commented Dec 11, 2024

Thanks for the report, I will try take a look.

@baraline
Copy link
Member

baraline commented Dec 12, 2024

I think the issue is that balanced subsample is a strategy only supported by ensembles if I remember correctly?

The current classifier used here is a ridgeCV which can't support this strategy. I'm not 100% familiar with Hydra MR, but either :

  • the aeon documentation is wrong and balanced subsample is not supported.
  • the classification strategy is not correctly implemented for Hydra MR (since don't remember whether the algorithm should be an ensemble classifier or not)

@MatthewMiddlehurst
Copy link
Member

This is passed directly to the classifier, seems like a documentation issue.

@Akhil-Jasson
Copy link
Contributor

I reviewed the documentation, but I’m unable to find any reference to balanced_subsample being supported by MultiRocketHydraClassifier.

Could you point me to the specific part of the documentation where this is mentioned?

@baraline
Copy link
Member

The issue was that balanced_subsample is not supported by the default RidgeCV estimator used in mr hydra. So the goal of the issue is to change the docstring here to describe that the class_weight parameter is given to the sklearn estimator and the supported values will be the ones of the estimators.

We could also add a check for that, but that might be one of the goal of #1777.

@Akhil-Jasson
Copy link
Contributor

Akhil-Jasson commented Jan 20, 2025

Wouldn't this particular issue be better categorized as a DOC rather than a BUG?

@dschrempf
Copy link
Contributor Author

In retrospect, I would say yes. I am changing the title.

@dschrempf dschrempf changed the title [BUG] Fitting MultiRocketHydraClassifier with class_weight="balanced_subsample" gives an error. [DOC] Fitting MultiRocketHydraClassifier with class_weight="balanced_subsample" gives an error Jan 20, 2025
@Akhil-Jasson
Copy link
Contributor

Hi @baraline, could you please take a look at my PR when you get a chance? Let me know if there's anything else I should address or look into. Thanks!

@MatthewMiddlehurst MatthewMiddlehurst added documentation Improvements or additions to documentation and removed bug Something isn't working labels Jan 22, 2025
@MatthewMiddlehurst
Copy link
Member

I think this case has been fixed. Thanks, will make sure have new PRs check for this before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
classification Classification package documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants