Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get core Dask functionality from cfdm #836

Merged
merged 31 commits into from
Jan 15, 2025

Conversation

davidhassell
Copy link
Collaborator

@davidhassell davidhassell commented Dec 5, 2024

Fixes #839

Remove core Dask functionality, importing it from cfdm instead.

Requires NCAS-CMS/cfdm#312 from cfdm.

Aims to not change the API, other than as required by the new cfdm.

@davidhassell davidhassell added this to the NEXT VERSION milestone Dec 5, 2024
Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall except for the case of the updated API for the binary operations stemming from a change to _binary_operation (which may have been deliberate but I suspect not) - see my inline comments regarding that which summarise my thoughts.

Otherwise just the usual minor comments. So happy to approve once we've considered/discussed the above issue. Thanks for your patience on this review.

Changelog.rst Show resolved Hide resolved
Changelog.rst Outdated Show resolved Hide resolved
Comment on lines +1323 to +1325
new_data = field0.data._binary_operation(
field0.data, field1.data, method
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this API whereby we are specifying field0.data as both the object of the method and an argument - is unnecessary and could attract bugs if the object and the first argument are different. I actually imagine you didn't intend this and missed this detail in the translation, but just to check?

It looks like the issue (assuming it wasn't intended as above) stems from including and using the Data data as an argument to def _binary_operation(cls, data, other, method) in the method translated to cfdm. If we drop the superfluous argument data then it should be fine - but this will mean quite a few updates here and perhaps in the corresponding cfdm PR too, e.g. from a git grep "_binary_oper" in the cf directory I see there are many methods to tweak:

data/data.py:        return self._binary_operation(self, other, "__add__")
data/data.py:        return self._binary_operation(self, other, "__iadd__")
data/data.py:        return self._binary_operation(self, other, "__radd__")
data/data.py:        return self._binary_operation(self, other, "__sub__")
data/data.py:        return self._binary_operation(self, other, "__isub__")
data/data.py:        return self._binary_operation(self, other, "__rsub__")
data/data.py:        return self._binary_operation(self, other, "__mul__")
data/data.py:        return self._binary_operation(self, other, "__imul__")
data/data.py:        return self._binary_operation(self, other, "__rmul__")
data/data.py:        return self._binary_operation(self, other, "__div__")
data/data.py:        return self._binary_operation(self, other, "__idiv__")
data/data.py:        return self._binary_operation(self, other, "__rdiv__")
data/data.py:        return self._binary_operation(self, other, "__floordiv__")
data/data.py:        return self._binary_operation(self, other, "__ifloordiv__")
data/data.py:        return self._binary_operation(self, other, "__rfloordiv__")
data/data.py:        return self._binary_operation(self, other, "__truediv__")
data/data.py:        return self._binary_operation(self, other, "__itruediv__")
data/data.py:        return self._binary_operation(self, other, "__rtruediv__")
data/data.py:        return self._binary_operation(self, other, "__pow__")
data/data.py:        return self._binary_operation(self, other, "__ipow__")
data/data.py:        return self._binary_operation(self, other, "__rpow__")
data/data.py:        return self._binary_operation(self, other, "__mod__")
data/data.py:        return self._binary_operation(self, other, "__imod__")
data/data.py:        return self._binary_operation(self, other, "__rmod__") 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are so many I won't make (GH) 'suggestions' here on the PR, but I'll emphasise this one on the main review comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to remember why I thought is was necessary to make _binary_operation a class method. I'll get back to you ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries - the more reviewing I did the more I changed my mind and thought it might indeed be deliberate, to update my comment above!

@@ -622,11 +621,11 @@ def _binary_operation(self, y, method):

if not inplace:
new = self.copy() # data=False) TODO
new_data = data._binary_operation(y, method)
new_data = data._binary_operation(data, y, method)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Relates to a previous comment) why need this to be specified when it is the object the method operates on?

cf/regrid/regrid.py Show resolved Hide resolved
cf/data/data.py Outdated Show resolved Hide resolved
cf/data/data.py Outdated Show resolved Hide resolved
cf/data/data.py Show resolved Hide resolved
cf/data/data.py Show resolved Hide resolved
cf/data/data.py Outdated Show resolved Hide resolved
davidhassell and others added 7 commits January 8, 2025 09:44
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
@davidhassell davidhassell added enhancement New feature or request dask Relating to the use of Dask labels Jan 15, 2025
Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments left to address, some newly-introduced, but then I am happy, so please merge 🙂

Changelog.rst Show resolved Hide resolved
cf/__init__.py Outdated Show resolved Hide resolved
Changelog.rst Outdated Show resolved Hide resolved
davidhassell and others added 2 commits January 15, 2025 17:28
Co-authored-by: Sadie L. Bartholomew <[email protected]>
@davidhassell davidhassell merged commit 9e29106 into NCAS-CMS:main Jan 15, 2025
@davidhassell davidhassell deleted the dask-in-cfdm branch January 15, 2025 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dask Relating to the use of Dask enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import Dask from cfdm
2 participants