Flexible snapshot number for data shuffling #570

Merged

@timcallow timcallow commented Aug 14, 2024

This PR allows the user to specify an arbitrary number of snapshots during the data shuffling procedure, even if the snapshot number isn't an exact divisor of the number of grid points in each snapshot.

In the non-exact-divisor case, a very small number of data points is "left out". To give an idea of what "very small" means, consider a typical grid of 200x200x200 points (8,000,000 points per snapshot) and a user requesting up to 10,000 snapshots. From the graph below, we can see that no more than ~0.1% of the initial data is left out for any of these choices.

[Figure: Screenshot from 2024-08-14 13-10-54]
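
As a rough cross-check of that number, here is a minimal sketch. It assumes that the points left out of each snapshot are simply the remainder of dividing the grid size by the requested snapshot number; the helper name `left_out_fraction` is hypothetical and not part of MALA, and the actual implementation may count things differently.

```python
def left_out_fraction(points_per_snapshot, n_shuffled):
    """Fraction of grid points dropped per snapshot.

    Assumption: the dropped points are the remainder of the integer
    division of the grid size by the requested snapshot number.
    """
    return (points_per_snapshot % n_shuffled) / points_per_snapshot


grid_points = 200 * 200 * 200  # 8,000,000 points per snapshot

# Worst case over every requested snapshot number from 1 to 10,000.
worst = max(left_out_fraction(grid_points, n) for n in range(1, 10_001))
print(f"worst-case fraction left out: {worst:.4%}")
```

Under this assumption the remainder is always smaller than the requested snapshot number, so the fraction left out stays below 10,000 / 8,000,000 = 0.125%.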

This PR is a fix for issue #509.

Right now, I've implemented this only for the numpy case, so it's not possible with openPMD. Frankly, I have no idea how to do it for the latter. I don't know how many users are currently using openPMD AND want to use an arbitrary snapshot number. If the answer is ~0, we can just keep this as a todo for the future.
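
For the numpy case, the idea of dropping the remainder can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function `shuffle_snapshots_sketch` and its arguments are hypothetical, each snapshot is assumed to be a 2D array of shape `(grid_points, features)`, and everything is held in memory.

```python
import numpy as np


def shuffle_snapshots_sketch(snapshots, n_shuffled, seed=0):
    """Distribute points from each snapshot over n_shuffled new snapshots.

    Hypothetical helper, not MALA's API. Points that do not fit into
    equally sized chunks are left out.
    """
    rng = np.random.default_rng(seed)
    chunks_per_output = [[] for _ in range(n_shuffled)]
    n_left_out = 0

    for snap in snapshots:
        n_points = snap.shape[0]
        chunk = n_points // n_shuffled           # points per output snapshot
        n_left_out += n_points - chunk * n_shuffled

        # Randomly permute the grid points of this snapshot.
        permuted = snap[rng.permutation(n_points)]

        # Hand one equally sized chunk to every shuffled snapshot.
        for j in range(n_shuffled):
            chunks_per_output[j].append(permuted[j * chunk:(j + 1) * chunk])

    shuffled = [np.concatenate(c, axis=0) for c in chunks_per_output]
    return shuffled, n_left_out


# Two toy snapshots with 10 points and 2 features each, shuffled into 3.
toy = [np.arange(20.0).reshape(10, 2), np.arange(20.0, 40.0).reshape(10, 2)]
shuffled, n_left_out = shuffle_snapshots_sketch(toy, n_shuffled=3)
```

In this toy case each shuffled snapshot receives 3 points from each original snapshot, and 1 point per original snapshot is left out.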

As mentioned in issue #564, I think the whole shuffling procedure could be simplified. But this PR ignores that possibility and integrates the new functionality into the code without making other changes.


@RandomDefaultUser RandomDefaultUser left a comment

Looks good to me, thanks @timcallow!

@RandomDefaultUser RandomDefaultUser merged commit 96e983d into mala-project:develop Oct 7, 2024
5 checks passed