-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add custom converter for ClassificationProgressiveLearner #1041
Conversation
Registers the custom shape calculator for the ClassificationProgressiveLearner class and the DictWrapper class (dependency of the ClassificationProgressiveLearner) Signed-off-by: Panos Doupidis <[email protected]>
Adds proglearn to the registered shape calculators. Signed-off-by: Panos Doupidis <[email protected]>
Implements the custom converter and parser for the ClassificationProgressiveLearner class. Signed-off-by: Panos Doupidis <[email protected]>
added ClassificationProgressiveLearner dependency. Signed-off-by: AvraamBardos <[email protected]>
The purpose of this file is to provide a helper class (DictWrapper), that wraps a dictionary as a scikit-learn estimator and a function to fill missing indices. Signed-off-by: AvraamBardos <[email protected]>
Import DictWrapper and fill_missing_indices. Signed-off-by: AvraamBardos <[email protected]>
This unittest is meant to test the functionality and the validity of the registered converters for the ClassificationProgressiveLearner class. In total, 7 unique test cases are assessed, targeting multiple parameter combinations. Signed-off-by: Panos Doupidis <[email protected]>
Signed-off-by: AvraamBardos <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
I would prefer to see these converters in a separate package. Taking a dependency on tensorflow and proglearn is a huge change. I assume more converters would be added in the future. I'm reluctant to start fixing the converters when they fail because a new version of proglearn was released. sklearn-onnx has an API to register custom converters. Creating a separate package should not be an issue. |
Hi @xadupre, Thank you for taking the time to review the pull request and for your valuable feedback. We understand your concerns about adding a dependency on like TensorFlow and ProgLearn, and the potential issues that could arise with future updates. Your suggestion to create a separate package for the converters is a good one, and we appreciate your insight about the API in sklearn-onnx to register custom converters. We will take your feedback into consideration as we continue to improve our project. Thanks again for your thoughtful review. |
Hi,
This pull request implements a custom converter regarding the
ClassificationProgressiveLearner
estimator as is defined in the neurodata proglearn repository. This package is designed for exploring and using Progressive learning algorithms. More specifically, Progressive learning is a concept in machine learning where the model learns from a series of tasks progressively, leveraging the knowledge gained from previous tasks to improve performance on later tasks. This approach can be particularly useful in scenarios where data comes in a sequential manner or when it’s beneficial to transfer knowledge from one task to another. If you are unfamiliar with the concept or wish to explore it in greater detail, you can find more information in the accompanying paper by Jayanta Dey et al.PR Details
The Progressive learning implementation is split into two separate categories,
TreeClassificationTransformer
andNeuralClassificationTransformer
, each utilizing a different underlying estimator.TreeClassificationTransformer
is based on an ensemble ofscikit-learn
DecisionTreeClassifier
instances (converters already implemented inskl2onnx
) whereas the latter is centered aroundtensorflow/keras
neural networks. Sinceskl2onnx
solely focuses onscikit-learn
estimators, this contribution emphasizes only onTreeClassificationTransformer
to avoid conflicts. Concretely, this contribution is loosely based on thepyod.models.iforest.IForest
custom converter, following its main set of instructions adapted to our needs.Requirements
In an effort to keep the core
skl2onnx
dependencies pure, the originalrequirements.txt
file is not modified, however, as a note the necessary additional dependencies are presented below (excluding the ones already required byskl2onnx
).Testing
The main entry point should be the
test_proglearn.py
file, inside thetests
folder, which defines a total of 7 independent test cases, after registering the external converters/shape calculators. It should be noted at this point that a version ofonnxruntime>=1.8.0
is required to perform these tests.Note: There is a known backtracking issue with pip which may result in multiple versions of the packages being installed. A quick fix is to use the
--use-deprecated legacy-resolver
option when runningpip install
.Thank you for taking the time to review this pull request! Looking forward to your insights and suggestions.