ENH: avoid particle_index type cast #4996

chrishavlin · 2024-09-19T17:38:41Z

Updated Summary (2025-01-23)

This PR adds logic to set and preserve data types for particle fields. All fields default to float64 except particle_index, which is set to int64. Some caveats, limitations and comments:

data types are only preserved for direct reads -- any operations will result in a cast to float
only float64 or int64 are allowed
it will only impact frontends that do not override the base _read_particle_selection, which is most of them; for the frontends that do override it (and would need separate updates), see this comment: ENH: avoid particle_index type cast #4996 (comment)
adding other fields (in addition to particle_index) to be set to int64 would be possible in each frontend by overriding the _particle_dtypes attribute added in this PR. But if those fields are used in computations that are passed down to cython functions, it's possible some errors will be exposed (though in my investigation, every call to a cython function is preceded by either an explicit cast to float or a unyt operation, which implicitly casts to float).
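A minimal sketch of the per-field dtype mapping described in this summary. It is not the code from this PR: the class names, the plain-string field keys, the `particle_type_id` field and the `allocate` helper are all hypothetical, and stdlib `array` typecodes ('d' for a 64-bit float, 'q' for a 64-bit signed int) stand in for numpy dtypes only to keep the example self-contained.

```python
from array import array
from collections import defaultdict

# Stand-ins for numpy dtypes: 'd' is a C double (float64),
# 'q' is a signed 64-bit integer (int64).
FLOAT64, INT64 = "d", "q"


class BaseIOHandlerSketch:
    """Hypothetical base handler: every field is float64 except particle_index."""

    @property
    def _particle_dtypes(self):
        dtypes = defaultdict(lambda: FLOAT64)
        dtypes["particle_index"] = INT64
        return dtypes

    def allocate(self, field, size):
        # Pick the buffer type from the mapping instead of always using float64.
        return array(self._particle_dtypes[field], [0] * size)


class FrontendIOHandlerSketch(BaseIOHandlerSketch):
    """Hypothetical frontend that marks one more field as integer-valued."""

    @property
    def _particle_dtypes(self):
        dtypes = super()._particle_dtypes
        dtypes["particle_type_id"] = INT64  # made-up field name
        return dtypes
```

In this sketch a frontend only has to override one property to opt more fields into int64; anything it does not list still falls back to float64.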

Original comment

This is a possible fix for #4995 which would affect all the particle frontends that don't override _read_particle_selection.
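For context, one reason such a cast can matter: float64 has a 53-bit significand, so integer IDs above 2**53 cannot all be represented exactly once cast to float. A small illustration in plain Python (the ID values are made up, and this assumes Python's `float` is a 64-bit IEEE double, as it is on common platforms):

```python
# float64 represents every integer exactly only up to 2**53.
LIMIT = 2**53

small_id = 123_456_789
big_id = LIMIT + 1  # made-up particle ID just past the exact range

# Small IDs survive a round trip through float unchanged.
small_roundtrip = int(float(small_id))

# The large ID is rounded to a neighbouring representable value.
big_roundtrip = int(float(big_id))

print(small_roundtrip == small_id)  # True
print(big_roundtrip == big_id)      # False
```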

chrishavlin · 2024-09-19T17:41:33Z

Converting to draft because this needs some more thorough testing: namely, does this introduce any bugs in cython operations that are expecting float arrays? Only tried it out with a projection and it works (because the int field gets converted to float before the projection), but need to check other functionality.

chrishavlin · 2024-09-23T20:03:37Z

Assuming tests pass again, I think this is ready for review.

For reference, the following IO child classes override _read_particle_selection and always cast all fields to float64:

IOHandlerGadgetFOFHaloHDF5
IOHandlerChomboHDF5 (actual cast to float occurs in ._read_particles)
IOHandlerOpenPMDHDF5

The one other IO frontend that overrides _read_particle_selection is IOHandlerOrion, but it does not initialize dtypes of arrays or cast to float64. Every other frontend inherits the base _read_particle_selection, so it would be affected by this PR. Based on my quick read, it'd be easy enough to enable the behavior of this PR in the 3 frontends above. I can do that here after initial review, in a followup PR, or I can just open an issue for future reference.

chrishavlin · 2024-09-23T20:05:12Z

oh, I suppose I should update the docs to mention this behavior, but I'll wait for review before doing that in case the functionality changes.

neutrinoceros · 2024-09-24T07:43:29Z

As I said somewhere else I'm currently unable to compile yt so I cannot review this properly at the moment. I hope someone else can step in.

chrishavlin · 2024-09-24T14:35:36Z

yt/data_objects/selection_objects/data_selection_objects.py

+            if finfos[f].units != finfos[f].output_units:
+                self.field_data[f].convert_to_units(finfos[f].output_units)


note to reviewers: this change is necessary because unit conversions on unyt arrays of int dtype always cast the result to float64. e.g., unyt.unyt_array([1, 2], "1", dtype='int').to("1") yields an array of type float64. So only converting when the units differ allows the int fields to persist through the data read.
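The effect of the guard can be illustrated without yt or unyt. In the sketch below, `FakeUnitArray` and `finalize_units` are hypothetical names; the stand-in class only mimics the behavior described in this note (any unit conversion returns floats) and is not the unyt API.

```python
class FakeUnitArray:
    """Hypothetical stand-in for a unit-carrying array; not the unyt API."""

    def __init__(self, values, units):
        self.values = list(values)
        self.units = units

    def convert_to_units(self, units):
        # Mimic the behavior noted above: converting always yields floats,
        # even when the target units equal the current ones.
        # (A real conversion would also rescale the values.)
        self.values = [float(v) for v in self.values]
        self.units = units


def finalize_units(arr, output_units):
    # Only convert when the units actually differ, so integer data
    # whose units already match is left untouched.
    if arr.units != output_units:
        arr.convert_to_units(output_units)
    return arr


ids = finalize_units(FakeUnitArray([1, 2, 3], "1"), "1")
pos = finalize_units(FakeUnitArray([1, 2, 3], "cm"), "m")
```

Here `ids.values` keeps its Python ints because the units already match, while `pos.values` goes through the conversion path and comes back as floats.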

matthewturk

This seems good to me as a first pass.

I want to note one other possible place we may want to update things. I think it might be too much work, but could be simplified with a broader particle type accessing refactor. (i.e., making it easier to figure out what type to make new fields, etc.)

I can see use cases for making the smoothing and deposition operations utilize integers. Specifically, if one wanted to deposit the nearest particle's index, that would be a use case we could support. But, it's not necessary for this.
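As a rough illustration of that use case only: the brute-force 1D sketch below deposits the ID of the nearest particle onto each cell. The function name and the data are made up, and it does not use yt's deposition or smoothing machinery.

```python
def deposit_nearest_index(cell_centers, particle_positions, particle_ids):
    """For each cell center, return the ID of the closest particle (1D)."""
    deposited = []
    for x in cell_centers:
        # Index of the particle with the smallest distance to this cell.
        nearest = min(
            range(len(particle_positions)),
            key=lambda i: abs(particle_positions[i] - x),
        )
        deposited.append(particle_ids[nearest])
    return deposited


cells = [0.125, 0.375, 0.625, 0.875]
positions = [0.1, 0.7]
ids = [2**53 + 1, 42]  # integer IDs, kept as int throughout
result = deposit_nearest_index(cells, positions, ids)
```

Because the deposited values are IDs rather than physical quantities, the output only stays meaningful if it remains an integer array from read to deposit.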

I would also like to suggest that we avoid (for now) allowing 32bit floats and ints, which you've also avoided here.

yt/utilities/io_handler.py

matthewturk · 2024-09-24T19:03:21Z

One thing that came to mind, in addition -- using fused types we can make it somewhat easier to address issues in the cython code, but we could also utilize JITs for fun in the distant future.

chrishavlin · 2024-09-24T19:27:26Z

I can see use cases for making the smoothing and deposition operations utilize integers. Specifically, if one wanted to deposit the nearest particle's index, that would be a use case we could support. But, it's not necessary for this.

I agree! and I think this PR opens up the possibility of doing just that.

I would also like to suggest that we avoid (for now) allowing 32bit floats and ints, which you've also avoided here.

agreed! I'll add a comment to the new _particle_dtypes attribute to reinforce that in case any frontends in the future end up overwriting the attribute.

yt/utilities/io_handler.py

chrishavlin · 2024-09-25T14:43:17Z

pre-commit.ci autofix

for more information, see https://pre-commit.ci

jzuhone · 2025-01-19T23:53:58Z

@chrishavlin what is needed to get this across the finish line?

chrishavlin · 2025-01-22T18:58:18Z

hey @jzuhone -- it's just in need of some reviews.

chrishavlin · 2025-01-23T15:45:36Z

just updated the main comment to make this easier to review.

chrishavlin · 2025-01-23T15:55:10Z

@yt-fido test this please

jzuhone · 2025-01-23T16:28:10Z

yt/utilities/io_handler.py

@@ -60,6 +60,22 @@ def push(self, grid, field, data):
            raise ValueError
        self.queue[grid][field] = data

+    @property


My only comment here is that I'd love it if it was more general (as this relates to the use case that @ChunYen-Chen was talking about). But if that is not easy to do in this PR we can just submit another one.

well my thinking was this _particle_dtypes dictionary would be a general, frontend-dependent dictionary mapping fields to types. Any ideas on how to make it more general/usable? Looking at it now after some months, I do think it could make sense to instead have this field-dtype mapping defined over in the FieldInfoContainer, but that would still require each frontend to override or add to it as needed.

chrishavlin · 2025-01-23T16:52:00Z

realized i re-triggered the wrong test... decided to just merge with main since it's been so long.

chrishavlin · 2025-01-23T17:23:56Z

that type-checking failure looks unrelated, will open an issue for it.

chrishavlin marked this pull request as draft September 19, 2024 17:38

chrishavlin added enhancement Making something better index: particle labels Sep 19, 2024

neutrinoceros mentioned this pull request Sep 21, 2024

Particle IDs of Gadget HDF5-style data are converted from integers to floats #4995

Open

chrishavlin force-pushed the handle_particle_dtypes branch from 59b5d85 to 9b5c823 Compare September 23, 2024 18:35

avoid particle_index type cast

641e590

chrishavlin force-pushed the handle_particle_dtypes branch from 9b5c823 to 641e590 Compare September 23, 2024 19:29

chrishavlin marked this pull request as ready for review September 23, 2024 20:03


matthewturk previously approved these changes Sep 24, 2024


chrishavlin dismissed matthewturk’s stale review via e6315d7 September 24, 2024 19:31

add comment of 32-bit int, float in _particle_dtypes

e6315d7

[pre-commit.ci] auto fixes from pre-commit.com hooks

26abd0d

for more information, see https://pre-commit.ci

ChunYen-Chen mentioned this pull request Jan 17, 2025

Update gamer frontend in yt to support integer particle attributes gamer-project/gamer#411

Open

jzuhone reviewed Jan 23, 2025


Merge remote-tracking branch 'upstream/main' into handle_particle_dtypes

c3f8d0f

chrishavlin mentioned this pull request Jan 23, 2025

mypy failure: unused error code in numpy2_compat.py #5100

Closed

Merge remote-tracking branch 'upstream/main' into handle_particle_dtypes

af7bb93

jzuhone approved these changes Jan 27, 2025
