-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grackle Tests #125
Grackle Tests #125
Conversation
A section was also added to the website documentation to describe these tests.
…that is necessary for the Grackle tests.
- fixed a bug in tools/field_summary.py that occurs for newer versions of yt - updated the expected restart values for double-precision
This commit consists of 3 main changesets that: 1. update the reference results for the single precision test 2. addressed some deprecation notices in tools/field_summary.py for modern versions of yt (while maintaining backwards compatability) - when passing a field to yt, we now a tuple including the fluid type (previously we just passed a single string, which caused some confusion) - the script now has yt compute the "weighted_standard_deviation" quantity instead of the "weighted_variance" quantity, when possible. In older versions of yt it falls back to having yt compute the "weighted_variance" quantity (which was incorrectly named - it always returned the weighted standard deviation). 3. tweaked the commands related to running the general_grackle test. Now the report is printed to stdout to make it obvious whether there where issues and exactly what they are
It turns out that this was the reason certain summary statistics were varying by more than 0.1% when I was using 2 different machines with different compiler versions (gcc versions 9.4.0 vs 7.5.0). This was not completely obvious because Grackle itself was never compiled with these flags (although it took me longer to determine than it should have). As far as I can tell most of the differences arose in EnzoInitialGrackleTest. We were already making this change in the CMake PR (enzo-project#139), anyways, so this shouldn't be controversial. As a result of this change, the following modifications were required for existing tests: - the expected values for the VL+CT dual_energy_cloud test needed to be updated. This change was previously discussed in the context of PR enzo-project#139. - the final time for the PPML tests had to be updated for binaries compiled with single precision - the cosmology test had to be disabled for binaries compiled with single precision. As is described in Issue enzo-project#154, it appears the `-ffast-math` had been suppressing an existing bug for this case (the test now crashes without the flag). Note that the test ran perfectly fine when compiled in double-precision. The double-precision results for the Grackle test also had to be updated (but the single-precision results were basically fine). Note that the tolerances for the double-precision results were calibrated so that the tests would pass if you used gcc versions 7.5.0 or 9.4.0 (they may need to be loosened more in the future). This commit also added the ability for the field_summary to accept different tolerances for different summary statistics (this is useful since we might want 0 tolerance for the location of a min/max while still applying a tolerance for the min/max itself). It also a runtime warnings raised by the script.
…ision binary (without the -ffast-math flag)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. I had one very minor comment and a somewhat unrelated question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (one minor correction and one minor docs request).
This comment addresses the following 3 PR comments: 1. We now comment out more fields required by Grackle to further test automatic field creation. 2. Fixed an issue with the length units definition. We had a comment stating that we were defining the length unit to be 1 pc, but were actually setting the length unit to 0.01 pc. I choose not to actually change the length unit definition (as was recommended by @gregbryan) since that would require us to update our reference answers. Instead, I updated the comment. 3. Better documented why we use such long timesteps (i.e. for increasing the chance of exposing SMP bugs) While modifying input/Grackle/grackle.incl, I deleted the "gamma" field. This was a holdover from a much earlier version of EnzoMethodGrackle. The current implementation does not support this field. This revealed that EnzoInitialGrackleTest was still accessing this field (it wasn't doing anything of substance with it). Therefore, I deleted the mentions of the "gamma" field from EnzoInitialGrackleTest (we can always add it back in the future).
…as accidentally overwritten while addressing a merge conflict.
@mabruzzo Please merge this once ready; looks like very little left to do! |
…CTest infrastructure
@peeples, I had to introduce 2 new scripts to get this to work properly. I'm not sure whether @gregbryan or @brittonsmith would want to look at this again. I'm inclined to pull the trigger, since these new scripts effectively transcribe the logic that I had previously encoded in an SConscript file (and the logic should now be more readable). Plus, merging this in might help with wrapping up the multispecies cosmology PR (#143). But, I'll defer to your judgement. |
Let's merge, and if more issues arise, they can be fixed in a small-and-fast PR. Thanks! |
Overview
This PR introduces 2 automated tests for
EnzoMethodGrackle
. These are the first tests to check that Grackle-related results remain consistent across commits. In both of these tests, the code sets up parallel one-zone models without hydrodynamics. More details are provided about the test below:method_grackle_general
: This uses thenull
method to set the simulation timestep (each timestep extends over multiple heating/cooling times). After a fixed amount of simulation time, the data is written to disk and the newtools/field_summary.py
script is used to compute several summary statistics for relevant fields ("pressure"
,"temperature"
,"cooling_time"
) and compare them against previously computed reference values. The summary statistics include the mean, standard deviation, min, min-location, max, and max-location.method_grackle_cooling_dt
: This uses a similar setup to the first test, except that the timesteps are set to be a fixed fraction of the minimum magnitude of the heating/cooling time. This test checks the final simulation time, after a fixed number of cycles, against some reference value.An advantage of not setting the timestep in the first test to be a fixed fraction of the minimum heating/cooling timescale is that it increases the chance that we might witness race-conditions (if any exist or get introduced) when the simulation is run in SMP-mode.
More Details
Unfortunately these tests may be a little fragile given that Grackle is an external dependency and incremental changes to Grackle may slightly change results. Hopefully, we can adjust the tolerances to be flexible enough that this isn't a major problem (but the tests remain meaningful). In any case, I think these tests are definitely an improvement over having nothing.
The enzo-dev style of regression/answer testing would certainly be a superior approach (it would facilitate comparisons between two revisions of Enzo-E on the same machine using a single build of Grackle). However, the machinery behind that testing system remains opaque to me. It would be great if we could move to that system sooner rather than later. I expect that my creation of
tools/field_summary.py
was reinventing the wheel to some degree, and it would be great to rely on existing machinery. As an aside,tools/field_summary.py
could probably be reused in the short-term for the creation of other tests (e.g. the 1D Zel'dovich pancake tests mentioned in Issue #123)An obstacle to creating these tests was that the path to the Grackle data files can't be hard-coded into the input files. To get around this issue, I have introduced
tools/gen_grackle_testing_file.py
which writes a new input file to disk that simply has an include-directive (i.e. to include the contents ofinput/method_grackle_general.in
orinput/method_grackle_cooling_dt.in
and overwrites the value of theMethod:grackle:data_file
parameter. I choose to make this functionality andtools/field_summary.py
into a command-line tool because of my uncertainty over the future format of the testing system (giving the pending transition of the build-system from Scons to CMake).While implementing
test/MethodGrackle/SConscript
I broke parts of the test into individual build steps. I realize that may be somewhat difficult to follow. Let me know if you think I should refactor running the simulation and executingtools/field_summary.py
into a single build step for themethod_grackle_general
test.Removing
-ffast-math
from default buildThis PR was delayed by issues I encountered with reproducibility. The tests ran fine on different machines with the same compiler version (e.g. on my personal laptop and the CI infrastructure). However, we got surprisingly large differences (the biggest relative difference was maybe ~0.1% for double-precision tests) when we ran the test using a binary with a different compiler version (e.g. my work machine). In particular I had encountered issues with versions 7.5 and 9.4 of the gcc compiler.
It turns out that these issues arose because of the value-unsafe floating-point optimizations enabled by the
-ffast-math
flag (which was enabled by default onlinux-gnu
builds). It took me a while to figure this out since Grackle itself is never compiled with this flag. As far as I can tell, most of the differences creep in during initialization of fields inEnzoInitialGrackleTest
.As a consequence of this issue, I have removed the
-ffast-math
flag from the default build configuration. I don't think this should be controversial since we were already moving towards this change in PR #139. Removing this flag required the following modifications to existing tests:scons
build & test infrastructure withcmake
#139).mass
constant in the cosmology test from"double"
to"default"
. Without this change, the test crashed for binaries compiled with single precision (without the-ffast-math
). This problem was discussed in detail in Issueinput/Cosmology/method_cosmology
single-precision bugs #154.While setting the tolerances for the double-precision version of the
method_grackle_general
test, I took some care to make sure they were loose enough to avoid issues when running the test on my 2 machines with different compiler versions. I didn't take as much care for the single-precision tests (I figure we can loosen the tolerances on those tests as it becomes necessary)