Change `atomic_types` and `gradients` from Sets to Unique Lists #296

PicoCentauri · 2024-07-13T08:32:51Z

This pull request changes the types of the atomic_types and gradients properties in the DatasetInfo and TargetInfo classes from sets to (unique) lists.

These properties were originally designed as sets because they represent collections of unique items. However, @frostedoyster repeatedly raised concerns about the difficulty of working with sets in certain contexts, particularly with functions and classes that expect lists. This change aims to resolve those issues.

The atomic_types and gradients are still stored internally as sets to maintain uniqueness, but they are exposed as sorted lists for compatibility. This change introduces additional methods and complexity to the classes, which can easily be seen from the number of added lines in this PR. For instance, the code had to be adjusted in various places, such as changing:

new_atomic_types = merged_info.atomic_types - self.dataset_info.atomic_types

to a much more nested style:

new_atomic_types = sorted(
    set(merged_info.atomic_types) - set(self.dataset_info.atomic_types)
)
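A minimal sketch of the pattern described above, assuming a simplified class: the values are kept in a set internally and exposed as a sorted list through a property. `DatasetInfoSketch` and its private `_atomic_types` attribute are hypothetical names for illustration and are not the actual classes changed in this PR.

```python
from typing import Iterable, List, Set


class DatasetInfoSketch:
    """Hypothetical stand-in for DatasetInfo; not the real implementation."""

    def __init__(self, atomic_types: Iterable[int]):
        # Store as a set so duplicates are dropped on construction.
        self._atomic_types: Set[int] = set(atomic_types)

    @property
    def atomic_types(self) -> List[int]:
        # Expose a sorted list so callers always see the same order.
        return sorted(self._atomic_types)


old_info = DatasetInfoSketch([1, 6, 8, 1])
merged_info = DatasetInfoSketch([8, 1, 6, 7, 16])

# Lists do not support "-", so both sides are converted back to sets first.
new_atomic_types = sorted(
    set(merged_info.atomic_types) - set(old_info.atomic_types)
)

print(old_info.atomic_types)  # [1, 6, 8]
print(new_atomic_types)       # [7, 16]
```

The round trip through `set(...)` is what makes the call site more nested than the original one-line set difference.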

I must express that I am very unhappy with these changes. I think changing the types is a very bad design choice. The properties atomic_types and gradients should clearly be sets, since they represent collections of unique items. Lists do not guarantee uniqueness. This change, in my opinion, compromises the clarity and integrity of the code. However, I have made these changes to accommodate the concerns raised and to facilitate smoother development.

@frostedoyster, please review these changes to ensure that they fix your issues in #286. (1) I expect you to add a clear and not messy test in this PR to ensure that your log files are rendered in exactly the order you want them. (2) I also encourage you to check and add a test for the continuation of your SOAP BPNN model when changing types. Scrolling through your tests in test_continue.py, I saw no tests checking this functionality...

Contributor (creator of pull-request) checklist

Tests updated (for new features and bugfixes)?
Documentation updated (for new features)?
Issue referenced (for PRs that solve an issue)?

📚 Documentation preview 📚: https://metatrain--296.org.readthedocs.build/en/296/

ceriottm · 2024-07-13T10:03:43Z

Sorry to intervene in what looks like a discussion that has been ongoing for a while, but what is the issue with wrapping these in list() before passing them to a function that expects a list? Also, what are the implications in terms of overhead, both of storing elements as sets and of wrapping sets in sorted() by default?

PicoCentauri · 2024-07-13T13:35:52Z

Good question. I would say it should work. At least this is what we do for each model. They can just store their atomic types and gradients as lists converted from the sets.
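For illustration, the call-site conversion discussed here could look like the sketch below. It assumes the property stays a set and only the consumer needs an ordered sequence. `format_gradients` and the gradient names are hypothetical and not taken from the code base.

```python
from typing import List, Set


def format_gradients(gradients: List[str]) -> str:
    """Hypothetical consumer that needs an ordered sequence."""
    return ", ".join(gradients)


gradients: Set[str] = {"strain", "positions"}

# sorted() accepts any iterable and returns a new list; the set is unchanged.
line = format_gradients(sorted(gradients))

print(line)  # positions, strain
```

Using `sorted()` rather than `list()` matters for string-valued sets: with Python's default hash randomization, the iteration order of a set of strings can differ between interpreter runs, whereas the sorted list is always the same. The cost is an O(n log n) sort on each conversion, which should be small for collections with only a handful of entries.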

frostedoyster · 2024-07-16T08:00:55Z

Thanks @PicoCentauri. I tested by hand that it works. However, I couldn't find a way to write a test for the order in the log, because we don't have a reliable dataset with stresses in our test suite. I will open an issue for that.

Change atomic_types and gradients from Sets to Unique Lists

c22a2aa

PicoCentauri requested review from DavideTisi, spozdn, abmazitov and frostedoyster as code owners July 13, 2024 08:32

frostedoyster linked an issue Jul 16, 2024 that may be closed by this pull request

Multiple gradient targets are printed in inconsistent order #286

Closed

frostedoyster and others added 2 commits July 16, 2024 10:01

Fix small issues

2b5cd10

Merge branch 'main' into dataclass-lists

a447a0f

frostedoyster approved these changes Jul 16, 2024

frostedoyster mentioned this pull request Jul 16, 2024

Add better dataset with stresses (or virials) #302

Closed

frostedoyster merged commit 576b2b0 into main Jul 16, 2024
14 checks passed

frostedoyster deleted the dataclass-lists branch July 16, 2024 08:17
