Ergonomic improvements to concatenation scripts #403
This all seems fine to me, but I'd like @helenarichie or @bcaddy to take a look, since I think they are the main ones who have been using the scripts in their existing form.
I think this looks good! I like having the default behavior be that you don't have to specify the number of processes. It might be nice to also do that with the snapshot numbers at some point. I also don't have any issues with changing the way the snapshot numbers are specified. I am having a little bit of trouble figuring out which arguments are required and what the default behavior is, and was wondering if you could clarify. For example, if I run this and my input directory contains 2D and 3D snapshots, will it just go ahead and concatenate all of them unless I tell it to omit certain types of output? Sorry if I'm missing something obvious! It would also probably be good to update the documentation with these changes.
And it is less error-prone!
I think that could definitely be nice! One of my goals here is achieving more sensible default behavior with fewer arguments (along those lines, it could be cool to do something along those lines with the My next primary concern is introducing some level of parallelism.
The behavior of For
Once we add support for concatenating particles, I might support
I'm happy to do that. I might hold off until I have your approval/this is merged.
This is great. I'm totally for it.
Ah, I see. I had gotten myself a little confused before... I appreciate the clarification! I think this is good to go!
I'll go ahead and merge this. @mabruzzo, feel free to update the wiki when you get a chance.
This introduces a few changes to the concatenation scripts. These are all surface-level changes (they don't affect the output format -- that is still something I want to address).
There are essentially 3 changes:
1. Adds concat.py, which can be used to concatenate more than one kind of dataset at a time (3D cubes or multiple kinds of 2D datasets).
2. Replaces --concat-outputs with --snaps (this is a subjective improvement).
   - --snaps is of the form START:STOP[:STEP] (just like a Python slice). For comparison, you would specify a range as FIRST-LAST with --concat-outputs (LAST is included in the range).
   - You can still use --concat-outputs, but you get a deprecation warning.
3. You no longer need to specify --num-processes. You can still pass it, but you get a warning about it.

I also haven't really touched the particle-concatenation script. Based on how code is shared, change number 2 will affect that script. But propagating change 3 would take some additional work.
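To make the difference between the two range syntaxes concrete, here is a minimal sketch. It is not the code from this PR: the names parse_snaps, parse_legacy_range, and snapshot_numbers are hypothetical, and it assumes that START and STOP are both required and that STOP is excluded, as in a Python slice.

```python
import argparse
import warnings


def parse_snaps(text):
    """Parse START:STOP[:STEP] into a range (STOP excluded, like a slice).

    Assumption: START and STOP are both required.
    """
    parts = text.split(":")
    if len(parts) not in (2, 3):
        raise argparse.ArgumentTypeError(
            f"expected START:STOP[:STEP], got {text!r}"
        )
    # int() and range() raise ValueError on bad input; argparse reports it.
    return range(*(int(p) for p in parts))


def parse_legacy_range(text):
    """Parse the older FIRST-LAST form, where LAST is included."""
    first, _, last = text.partition("-")
    return range(int(first), int(last) + 1)


def build_parser():
    # Hypothetical parser: only the two snapshot-selection flags are modeled.
    parser = argparse.ArgumentParser(description="sketch of snapshot selection")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--snaps", type=parse_snaps)
    group.add_argument("--concat-outputs", type=parse_legacy_range)
    return parser


def snapshot_numbers(argv):
    """Return the list of snapshot numbers selected by argv."""
    args = build_parser().parse_args(argv)
    if args.concat_outputs is not None:
        warnings.warn(
            "--concat-outputs is deprecated; use --snaps instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return list(args.concat_outputs)
    return list(args.snaps)


if __name__ == "__main__":
    print(snapshot_numbers(["--snaps", "0:10:5"]))         # [0, 5]
    print(snapshot_numbers(["--concat-outputs", "2-5"]))   # [2, 3, 4, 5]
```

Under these assumptions, selecting snapshots 2 through 5 means --snaps 2:6 with the new flag and --concat-outputs 2-5 with the deprecated one.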
I'm definitely open to any kind of feedback!
In a subsequent PR, it's my intention to modify things so that we can optionally use multiprocessing or mpi4py to speed up concatenation (I have an idea on how to do that in a minimally invasive manner that won't affect hypothetical dask-usage).
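For readers unfamiliar with the idea, one generic way to parallelize over snapshots with multiprocessing is sketched below. This is only an illustration and not necessarily the approach intended for the follow-up PR: concat_snapshot and concat_all are hypothetical names, the per-snapshot work is a placeholder, and it assumes each snapshot can be concatenated independently of the others.

```python
from multiprocessing import Pool


def concat_snapshot(snap):
    """Placeholder for the per-snapshot concatenation work (hypothetical)."""
    return f"concatenated snapshot {snap}"


def concat_all(snaps, num_workers=1):
    """Run concat_snapshot over snaps, serially or with a process pool.

    Assumption: snapshots are independent, so they can be processed in
    any order. Results are returned in the same order as the input.
    """
    snaps = list(snaps)
    if num_workers <= 1:
        # Serial fallback keeps the default behavior simple.
        return [concat_snapshot(s) for s in snaps]
    with Pool(processes=num_workers) as pool:
        return pool.map(concat_snapshot, snaps)


if __name__ == "__main__":
    print(concat_all(range(0, 4), num_workers=2))
```

Keeping the per-snapshot work in a single function means the same function could, in principle, be driven by a serial loop, a process pool, or another scheduler.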