ENH: SPMD interface for IncrementalEmpiricalCovariance #1941
Conversation
Force-pushed from 8ddd338 to 2a3fcd5
self._partial_result = BaseEstimator._get_backend(
    self, "covariance", None, "partial_compute_result"
How about inheriting BaseEstimator in the class definition instead?
BaseEmpiricalCovariance is inherited from BaseEstimator. The trick here is that we inherit from BaseEstimatorSPMD, which also has _get_backend. Thus, if we put self._get_backend here, the SPMD backend would be called, and it does not contain partial_compute_result.
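To illustrate the point above, here is a minimal, hypothetical sketch (class names simplified, backends replaced by strings) of why the explicit call through BaseEstimator is needed: with the SPMD class in the MRO, self._get_backend resolves to the SPMD variant, which lacks partial_compute_result.

```python
# Hypothetical sketch: the real backends are onedal4py modules; here
# they are mocked with strings to show only the method-resolution issue.

class BaseEstimator:
    def _get_backend(self, module, submodule, attr):
        # host backend exposes partial_compute_result
        return f"host:{attr}"

class BaseEstimatorSPMD(BaseEstimator):
    def _get_backend(self, module, submodule, attr):
        # SPMD backend lacks the incremental partial-compute entry points
        if attr == "partial_compute_result":
            raise AttributeError(f"SPMD backend has no '{attr}'")
        return f"spmd:{attr}"

class IncrementalEmpiricalCovarianceSPMD(BaseEstimatorSPMD):
    def _reset(self):
        # self._get_backend(...) would hit the SPMD override and fail,
        # hence the explicit call through BaseEstimator:
        return BaseEstimator._get_backend(
            self, "covariance", None, "partial_compute_result"
        )

est = IncrementalEmpiricalCovarianceSPMD()
print(est._reset())  # → host:partial_compute_result
```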
I'd prefer a definition of _reset in the SPMD interface which uses a super call for locality code. That would be simpler for maintainers in the future to see why certain things are done. At a minimum it needs a comment in the code.
It is also necessary to redefine partial_fit, because partial_compute also does not exist in the SPMD backend. I'm not sure the code duplication is a better idea than the one currently implemented.
I should also add that a super call would not work there, because BaseEstimator and BaseEstimatorSPMD have the same methods, and if the class has both of them as parents we need to specify directly which one should be called anyway.
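A toy illustration of this point (class names are made up): when two parents define the same method, super() follows the MRO and reaches only the first one, so the other parent has to be called by name.

```python
# Illustrative only: two parents with a clashing method name.

class Host:
    def get_backend(self):
        return "host"

class Spmd:
    def get_backend(self):
        return "spmd"

class Estimator(Spmd, Host):
    def spmd_backend(self):
        # super() walks the MRO: Spmd comes first, so "spmd" is returned
        return super().get_backend()

    def host_backend(self):
        # reaching the other parent requires naming it explicitly
        return Host.get_backend(self)

e = Estimator()
print(e.spmd_backend())  # → spmd
print(e.host_backend())  # → host
```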
I would ask you to do this refactoring after this PR is merged. Currently this looks like a workaround.
Create some BaseIncrementalEstimator, where the get_backend method returns the required backend depending on the provided method name. This will be common for all incremental algos.
> Create some BaseIncrementalEstimator, where the get_backend method returns the required backend depending on the provided method name. This will be common for all incremental algos.

This might be a good idea, I'll think about it.
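The suggested base class could be sketched roughly as follows. Everything here is hypothetical (the class name, the set of host-only methods, and the string stand-ins for backends): a single get_backend routes to the host or SPMD backend depending on the requested method name.

```python
# Rough sketch of the proposed BaseIncrementalEstimator; all names
# are hypothetical and the backends are mocked with strings.

# Incremental partial-compute entry points exist only on the host backend.
HOST_ONLY_METHODS = {"partial_compute", "partial_compute_result"}

class BaseIncrementalEstimator:
    def __init__(self, host_backend, spmd_backend):
        self._host = host_backend
        self._spmd = spmd_backend

    def get_backend(self, method_name):
        # route by method name: partial steps go to the host backend,
        # everything else may come from the SPMD backend
        if method_name in HOST_ONLY_METHODS:
            return self._host
        return self._spmd

est = BaseIncrementalEstimator(host_backend="host", spmd_backend="spmd")
print(est.get_backend("partial_compute"))  # → host
print(est.get_backend("finalize"))         # → spmd
```

This keeps the routing decision in one place instead of scattering explicit BaseEstimator calls through every incremental SPMD estimator.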
if not hasattr(self, "_queue"):
    self._queue = queue
Why is this necessary? It should be handled by universal functionality, not in the estimators.
It is for finalize_fit dispatching. finalize_fit does not have a data argument, so if the user does not provide a queue explicitly, the last queue from partial_fit is used.
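A minimal illustration of this fallback, with hypothetical names and the queue mocked as a string: partial_fit remembers the queue it was given so that finalize_fit, which receives no data, can still dispatch.

```python
# Sketch only: the real queue is a SYCL queue object, mocked here
# as a string; the estimator logic is reduced to the queue handling.

class IncrementalEstimator:
    def partial_fit(self, X, queue=None):
        if not hasattr(self, "_queue"):
            self._queue = queue  # remember the first explicit queue
        self._n_batches = getattr(self, "_n_batches", 0) + 1
        return self

    def finalize_fit(self, queue=None):
        # no data argument to infer a queue from, so fall back to the
        # queue stored by partial_fit when none is given explicitly
        used_queue = queue if queue is not None else self._queue
        return used_queue

est = IncrementalEstimator()
est.partial_fit([[1.0]], queue="gpu_queue")
print(est.finalize_fit())  # → gpu_queue
```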
Can't the last queue be extracted from the stored policy via the _queue property? A lot of this logic might be unnecessary. @olegkkruglov let me know.
It was done like that before this PR, but now it turned out that different policies must be used in finalize_fit and partial_fit; that's why the policy stored from partial_fit is not acceptable.
Since it's just partial_fit and _reset which require a data-parallel policy, I would overload partial_fit and _reset in the SPMD interface and make sure that the correct policy is taken there via a super call. A small duplication of code, but at least it's clear to the developer and the user what is going on, and it's located in a place where someone looking to understand the SPMD interface can see the limitations of partial_fit for SPMD in the incremental algos. It would then also be more straightforward for @samir-nasibli's requested refactor with an incremental base class.
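The overloading suggested above could look roughly like this (illustrative names, policies mocked as strings): the SPMD subclass picks the data-parallel policy itself and delegates the shared logic via super(), so the SPMD limitation is visible in the SPMD code.

```python
# Sketch under the stated assumption that only partial_fit and _reset
# need the data-parallel policy; policy objects are mocked as strings.

class IncrementalEmpiricalCovariance:
    def _reset(self, policy="host_policy"):
        self._partial_result = f"reset with {policy}"

    def partial_fit(self, X, policy="host_policy"):
        self._last_policy = policy
        return self

class IncrementalEmpiricalCovarianceSPMD(IncrementalEmpiricalCovariance):
    def _reset(self):
        # SPMD limitation lives here, where SPMD readers will see it:
        # partial steps need the data-parallel policy, not the SPMD one
        super()._reset(policy="data_parallel_policy")

    def partial_fit(self, X):
        return super().partial_fit(X, policy="data_parallel_policy")

est = IncrementalEmpiricalCovarianceSPMD()
est._reset()
est.partial_fit([[1.0]])
print(est._partial_result)  # → reset with data_parallel_policy
print(est._last_policy)     # → data_parallel_policy
```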
> It is for finalize_fit dispatching. finalize_fit does not have a data argument, so if the user does not provide a queue explicitly, the last queue from partial_fit is used.

Maybe it makes sense to explicitly ask the user to provide a SYCL queue? Otherwise it is a headache.
I understand that this is based on the oneDAL API, but it seems the real fix should be on the oneDAL side. I didn't find any SPMD incremental example that uses policies. The interface itself seems inconvenient to me for the oneDAL user. It makes sense to redesign the API there first and then expose it here, at the sklearnex level.
finalize_fit is called implicitly on the sklearnex side. If we want to keep a scikit-learn-like interface (without an explicit finalize call), the only option is to call finalize after every call of partial_fit. That option was rejected at the arch meeting; that's why keeping the queue in the attributes is currently unavoidable.
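The implicit-finalize pattern described above can be sketched as follows (names and the queue are stand-ins, and the "covariance" is reduced to a running sum): reading a fitted attribute triggers finalize lazily, so the user never calls finalize_fit themselves.

```python
# Hedged sketch: lazy finalize driven by attribute access, with the
# queue stored by partial_fit for the later implicit finalize.

class IncrementalEstimator:
    def partial_fit(self, X, queue=None):
        self._queue = queue  # kept for the later implicit finalize
        self._sum = getattr(self, "_sum", 0.0) + sum(X)
        self._finalized = False
        return self

    def _finalize(self):
        # stands in for the real finalize_fit, which would use the
        # stored queue for its computations
        if not self._finalized:
            self.covariance_ = self._sum
            self._finalized = True

    @property
    def covariance(self):
        self._finalize()  # finalize only when results are read
        return self.covariance_

est = IncrementalEstimator()
est.partial_fit([1.0, 2.0], queue="q")
print(est.covariance)  # finalize happens implicitly here → 3.0
```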
/intelci: run
Some high-level questions about the policy queues and the backend. Also, comments would be nice.
Force-pushed from 5544ed6 to 3f643f9
/intelci: run
I think overloading partial_fit and _reset in the SPMD class is preferable to changing the underlying classes and storing queues. It would be simpler to understand; SPMD problems should stay in SPMD.
> Since it's just partial_fit and _reset which require a data-parallel policy, I would overload partial_fit and _reset in the SPMD interface and make sure that the correct policy is taken there via a super call. [...]
Why is storing the queue worse than storing the policy? As far as I can see, storing the queue instead of the policy is unavoidable, because the policy from partial_fit can't be used in finalize_fit, and that is the only thing the policy was stored for.
/intelci: run
Force-pushed from fa06cc2 to 714a932
/intelci: run
Force-pushed from a7e9085 to e5458b3
done
/intelci: run
Approval dependent on @samir-nasibli's requests (docstrings at least).
Thank you @olegkkruglov!
Now it looks good to me!
Assuming green CI, please share the internal CI job link.
Expecting quick follow-up refactoring, based on the tickets created and for the docstrings mentioned.
I am good to go functionally with this PR as is, but the refactoring is deserved before CF.
Description

- finalize_fit requires spmd_policy, but partial_fit requires data_parallel_policy on the oneDAL side
- finalize_fit now uses the provided queue for computations on the onedal4py side
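The policy split in the description can be summarized with a small dispatcher sketch. The policy classes below are stand-ins, not the real onedal4py types: partial_fit builds a data-parallel policy from the queue, while finalize_fit builds an SPMD policy.

```python
# Hypothetical stand-ins for the two oneDAL policy kinds; only the
# per-method selection logic from the description is illustrated.

class DataParallelPolicy:
    def __init__(self, queue):
        self.queue = queue

class SpmdPolicy:
    def __init__(self, queue):
        self.queue = queue

def get_policy(method, queue):
    # different oneDAL entry points require different policy kinds
    if method == "partial_fit":
        return DataParallelPolicy(queue)
    if method == "finalize_fit":
        return SpmdPolicy(queue)
    raise ValueError(f"unknown method: {method}")

p1 = get_policy("partial_fit", "q")
p2 = get_policy("finalize_fit", "q")
print(type(p1).__name__)  # → DataParallelPolicy
print(type(p2).__name__)  # → SpmdPolicy
```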