Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ensure that property importing and calculations use consistent catalogue #117

Open
mtremmel opened this issue Jul 9, 2020 · 5 comments
Open

Comments

@mtremmel
Copy link
Contributor

mtremmel commented Jul 9, 2020

Currently, when importing halo properties from halo stat files generated by a halo finder, the HaloStatFile class iterates through all possibilities until it lands a subclass that works. However, if multiple catalog formats exist this can be a problem. An example is if AHF was run and the IDL post processing also exists (creating simname.amiga.grp, simname.amiga.stat, etc). The IDL version will often omit halos that exist in the full catalog for a zoom run. Yet, if the AHF stat file still exists, the HaloStatFile class will use this file to import halos. Then, when properties are calculated with, for example, pynbody and the user has it set to read the IDL files instead, it will attempt to run the calculations on halos that do not exist in that catalog. Either allowing the user to specify which catalog to use at runtime or including an option to set this in config.py would be useful to avoid these errors.

@apontzen
Copy link
Member

apontzen commented Jul 9, 2020

You could force this by implementing a custom handler I think -- take a look here https://pynbody.github.io/tangos/custom_input_handlers.html

In your custom input handler, you'd want to override get_stat_file and force it to use the particular subclass you want.

@mtremmel
Copy link
Contributor Author

mtremmel commented Jul 9, 2020

It feels to me like it should be easier than writing one's own custom handler. I don't think it is too hard to include it as a command line argument, but would that make things messy in some way? One could imagine someone wanting to create multiple database files for different halo finders. So long as those halo finders are already compatible with TANGOS it should be easy to tell it "use this one".

I was also thinking more generally whether we'd want to ensure the stat_file handler always tries to be consistent with the input_handler, but that might introduce too many problems with the specific ways those different tools select halo catalogs.

@apontzen
Copy link
Member

apontzen commented Jul 9, 2020

Yes, I do on balance think it would be a good idea for the pre-implemented input handlers to select a particular stat file if that is what they are expecting. Then we could add an input handler that you have to explicitly select if you want to use the IDL-generated catalogues? (I'm not sure why you'd want to incidentally?)

@trquinn
Copy link
Contributor

trquinn commented Jul 13, 2020

I think I'm having this problem with one of the small cluster simulations: any "tangos write" fails with "ValueError: Halo 220 does not exist". The "...amiga.stat" file indeed does not have a halo numbered 220. Is there a way around this so I can get this simulation into tangos?

apontzen added a commit that referenced this issue Jul 16, 2020
…ok's toolset),

or to insist on ignoring those catalogues. This is in response to issue #117, where
it has become clear people have both types of catalogue on disk and this can cause
severe confusion due to the automated search for catalogues and statistics files.

To insist on using IDL-generated catalogues, use

  tangos add <name_of_sim> --handler=pynbody.ChangaUseIDLInputHandler

To insist on ignoring IDL-generated catalogues, use

  tangos add <name_of_sim> --handler=pynbody.ChangaIgnoreIDLInputHandler
@apontzen
Copy link
Member

I've tried a fix for this in pull request #120 (branch issue-117). See the notes there for how to use.

Let me know if this helps @trquinn @mtremmel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants