Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hashing #121

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Fix hashing #121

wants to merge 2 commits into from

Conversation

ElliottKasoar
Copy link
Contributor

@ElliottKasoar ElliottKasoar commented Jan 10, 2025

  • Fixes the hasher definition, as discussed in the issue.

    • I've removed the second creation of a hasher object, which means the overall hash is similar to the erroneous behaviour before, in that the structure hash contributes to the final hash.
  • Also updates the deprecated utcnow, and tidies the code slightly.

Edit: hash_strucutre seems fixed in test_write_and_read, but actually I'm not sure the example in the issue quite works yet, so I'll leave this as a draft for now.

@ElliottKasoar ElliottKasoar marked this pull request as draft January 10, 2025 18:33
@ElliottKasoar
Copy link
Contributor Author

ElliottKasoar commented Jan 10, 2025

Ah, so @stenczelt, while this does fix the hash_structure example currently in the tests, the example in the linked issue still fails, which must be because the (float) positions read in by ASE differ to the original random numbers:

  comparison failed. Mismatched elements: 8 / 9:
  Max absolute difference: 4.723046176380308e-09
  Max relative difference: 2.2429495787383598e-08
  Index  | Obtained   | Expected                     
  (0, 0) | 0.77395605 | 0.7739560485559633 ± 1.0e-10 
  (0, 1) | 0.43887844 | 0.4388784397520523 ± 1.0e-10 
  (1, 0) | 0.69736803 | 0.6973680290593639 ± 1.0e-10 
  (1, 1) | 0.09417735 | 0.09417734788764953 ± 1.0e-10
  (1, 2) | 0.97562235 | 0.9756223516367559 ± 1.0e-10 
  (2, 0) | 0.7611397  | 0.761139701990353 ± 1.0e-10  
  (2, 1) | 0.78606431 | 0.7860643052769538 ± 1.0e-10 
  (2, 2) | 0.12811363 | 0.12811363267554587 ± 1.0e-10

Since test_write_and_read only looks at integer positions, this doesn't get picked up.

Do we want to try truncating the precision used in the hash to try to fix this?

It feels slightly dangerous, but I'm not sure what the alternative is, unless we accept that this is a limitation.

@stenczelt
Copy link
Member

I agree it would be dangerous to truncate positions, especially since then it's not really a hash or checksum of the object but turning into some similarity metric (albeit a poor one).
As the ASE write & read ops are apparently destructive to some of the structure, we may not be testing what we are meaning to test, so can also change the test. Or perhaps if we have control over the float formatting of ASE then we could force it to not introduce any difference?

@ElliottKasoar
Copy link
Contributor Author

ElliottKasoar commented Jan 21, 2025

I agree it would be dangerous to truncate positions, especially since then it's not really a hash or checksum of the object but turning into some similarity metric (albeit a poor one). As the ASE write & read ops are apparently destructive to some of the structure, we may not be testing what we are meaning to test, so can also change the test. Or perhaps if we have control over the float formatting of ASE then we could force it to not introduce any difference?

For some file formats e.g. castep-cell there's a precision option, but I'm not sure we can control it for (ext)xyz files, or more generally.

Perhaps we stick with this for now as a starting point for fixing hashing, since this fixes a source of similar errors?

#118 isn't resolved, but I think the existing tests do all work in the way that you would expect, including the read/write example - I don't think it's wrong to test that the hash is preserved in that example, as long as we're aware that we can't guarantee it'll always be preserved for now.

@ElliottKasoar ElliottKasoar marked this pull request as ready for review January 22, 2025 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants