
Add more loss types and options #385

Merged: 5 commits into main from mae-loss on Nov 12, 2024
Conversation

@frostedoyster (Collaborator) commented Nov 10, 2024:

This PR implements more loss options. Now the user can choose the type of loss (mse, mae, huber) as well as the reduction used (sum or mean).
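
For illustration, a sketch of what the resulting ``loss`` section of ``options.yaml`` could look like, using only the keys discussed in this PR (``weights``, ``type``, ``reduction``); the weight values are placeholders rather than defaults:

loss:
    weights: {"energy": 1.0, "forces": 0.1}
    type: mse        # or mae, or a huber object with its own options
    reduction: sum   # or mean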

Contributor (creator of pull-request) checklist

  • Tests updated (for new features and bugfixes)?
  • Documentation updated (for new features)?
  • Issue referenced (for PRs that solve an issue)?

📚 Documentation preview 📚: https://metatrain--385.org.readthedocs.build/en/385/

@PicoCentauri (Contributor) left a comment:

Very useful addition. I hope our config doesn't get too complicated with all these nested options. I suggested one shorthand notation which might be useful.

(quoted from the documentation diff)

and 0.1 for the forces, you can set the following in the ``options.yaml`` file:
``loss_weights: {"energy": 1.0, "forces": 0.1}``.
- ``loss``: This section describes the loss function to be used, and it has three
  subsections. 1. ``weights``. This controls the weighting of different contributions
@PicoCentauri (Contributor):

Maybe give the default value here?

Comment on lines 31 to 32
weights: {}
type: mse
@PicoCentauri (Contributor):

I would switch the type and the weight dictionary.

@frostedoyster (Collaborator, Author):

Sure

Comment on lines 210 to 211
reduction=self.hypers["loss"]["reduction"],
type=self.hypers["loss"]["type"],
@PicoCentauri (Contributor):

Does it make sense to design the function in such a way that you can simply pass

loss_fn = TensorMapDictLoss(**self.hypers["loss"])

@frostedoyster (Collaborator, Author):

Good idea, I'll try to do it
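
For context, a minimal sketch of that pattern, assuming a hypothetical constructor whose keyword names mirror the keys of hypers["loss"]; the real TensorMapDictLoss signature may differ:

# Hypothetical stand-in for the real class, for illustration only.
class TensorMapDictLoss:
    def __init__(self, weights, type="mse", reduction="sum"):
        self.weights = weights
        self.type = type
        self.reduction = reduction

hypers = {"loss": {"weights": {}, "type": "mse", "reduction": "sum"}}
loss_fn = TensorMapDictLoss(**hypers["loss"])  # no need to pass each key by hand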

"oneOf": [
{
"type": "string",
"enum": ["mse", "mae"]
@PicoCentauri (Contributor):

isn't "huber" missing here?

@frostedoyster (Collaborator, Author):

"huber" is actually an object (not a string) and it's allowed, just a few lines below these

log_mae: False
loss:
@PicoCentauri (Contributor):

I think we should also just allow

loss: mse

and expand this to

loss:
    type: mse
    weights: {}
    reduction: sum

This should help usability and is in line with what we do for the datasets.

@frostedoyster (Collaborator, Author) commented Nov 11, 2024:

I think this is confusing for the user, because then the hypers change dynamically. Since we give the user the full template of the hypers to rely on, I would keep the format exactly as in the default hypers template (at least for now)

Member:

I agree with @PicoCentauri here, having a smaller version of the same input is nice, especially if people are not just copy-pasting our templates around. We are already doing this for datasets:

training_set: "dataset.xyz"

Actually means

training_set:
    systems:
        read_from: dataset.xyz
        reader: ase
        length_unit: null
#        ...

@frostedoyster (Collaborator, Author) commented Nov 11, 2024:

Copy-pasting the templates is the only way for users to modify the hypers... they can't make them up. This means that this feature, which is hard to implement, test and maintain, won't be used much

@frostedoyster (Collaborator, Author):

I'm saying that the overhead for this feature just isn't worth it, considering that the projected gain is from

loss:
  type: mae

to

loss: mae

Member:

It's not a lot more, but it is more consistent with other parameters, and nicer to use when writing the input by hand.

I haven't checked the code complexity, but this also looks like a fairly easy thing to implement, so the implementation overhead should be very manageable.

@frostedoyster (Collaborator, Author):

For now, I think it opens too many possibilities and, from what I can see, it's not easy to test and maintain. Like in programming languages, having many ways of achieving the same thing is not always an advantage. I will open the issue though

Member:

I don't see why this opens too many possibilities, could you elaborate? Testing and maintenance should also be trivial: if the loss is given as a string, you transform it to loss: {"type": <value>} and then continue as is. Am I missing something here?
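
A minimal sketch of that normalization, assuming the loss hypers arrive as a plain Python dict (or a bare string) and using the defaults shown earlier in this thread:

def normalize_loss_hypers(loss):
    # Shorthand: a bare string such as "mse" stands for the whole section.
    if isinstance(loss, str):
        loss = {"type": loss}
    # Fill in missing keys so downstream code can rely on the full structure.
    loss.setdefault("type", "mse")
    loss.setdefault("weights", {})
    loss.setdefault("reduction", "sum")
    return loss

assert normalize_loss_hypers("mae") == {"type": "mae", "weights": {}, "reduction": "sum"}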

Member:

I could understand if we were trying to be very explicit everywhere, but in general metatrain's options.yaml already includes a lot of syntactic sugar, from the default hypers to the dataset handling, and I really don't see how the loss would be any different.

@frostedoyster (Collaborator, Author):

I would merge by the end of today if there are no objections, because we need this feature for some work and the outstanding point is more of a general design issue. I will make sure to open an issue about it

@frostedoyster merged commit 36de384 into main on Nov 12, 2024
12 checks passed
@frostedoyster deleted the mae-loss branch on November 12, 2024 at 17:39