
Use resourceLimits and add test runs #53

Merged: 9 commits into main, May 29, 2024
Conversation

jashapiro (Member)

This started with just a change to use the new process.resourceLimits setting to replace our functions for checking memory and CPUs against the maximum allowances.
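
For context on what this replaces: resourceLimits is set once in the Nextflow config, and any process directive requesting more than a limit is automatically reduced to it. A minimal sketch (the values here are illustrative, not necessarily this repo's):

```groovy
// nextflow.config -- illustrative limits, not this repo's actual values
process {
    resourceLimits = [
        cpus: 8,
        memory: 32.GB,
        time: 24.h
    ]
}
```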

One thing that was a bit non-obvious was how to handle the mem_max label: I ended up just picking a memory amount larger than any max value we are likely to use in the directive, but this is automatically scaled down to whatever is set in the process.resourceLimits setting.

The one disadvantage of this setup is that it is a bit harder to change limits on the fly; you can't just set a param and have it propagate as we could before, but I don't expect that to be a common use case.
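
That said, limits can still be adjusted at launch time by layering an extra config file with -c (standard Nextflow behavior, not something added in this PR); a hypothetical override might look like:

```groovy
// my-limits.config -- hypothetical override file
process {
    resourceLimits = [cpus: 16, memory: 512.GB, time: 48.h]
}
```

run as: nextflow run <pipeline> -c my-limits.config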

After I added the resourceLimits changes, I thought this was a good chance to add some more testing, so I included a new GitHub action for stub runs of the workflow.
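
A sketch of what a minimal stub-run workflow can look like (this is a hypothetical version; the action versions, profile name, and entry script are assumptions, not the PR's actual file):

```yaml
# .github/workflows/stub-run.yml -- hypothetical sketch, not the PR's actual file
name: stub-run
on:
  pull_request:
    branches: [main]
jobs:
  stub:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nf-core/setup-nextflow@v2
      # run the whole pipeline with process stubs instead of real commands
      - run: nextflow run main.nf -stub-run -profile stub  # profile name assumed
```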

jashapiro requested a review from allyhawkins on May 29, 2024 17:41
allyhawkins (Member) left a comment:

I agree that it's a bit annoying not to be able to set the options at the command line, but if we find ourselves running a lot of processes that use > 512 GB, we can revisit this. I think that's highly unlikely, though. I just had a few small comments.

Comment on lines 4 to 6

```yaml
push:
  branches:
    - jashapiro/resourceLimits
```
allyhawkins (Member):

Just a note that we should remove this before merging?

jashapiro (Member, Author):

Thanks for catching this & the lingering params.max_memory.

Should be ready to go now.

```groovy
withLabel: mem_max {
    // max memory for tasks that have failed at least twice with an OOM-style
    // exit code (137-140); set to 2.TB, but this is automatically reduced
    // to whatever process.resourceLimits allows
    memory = { (task.attempt > 2 && task.exitStatus in 137..140) ? 2.TB : 64.GB * task.attempt }
}
```
allyhawkins (Member):

Just noting that I think it's fine to keep this set at 2 TB. Although if you're going to have to actually change the config files to change the resource limits anyway, why not just set this to whatever the resource limit is?

jashapiro (Member, Author):

Yeah, I figured it was easier to have only one place to change/update. There is a small risk we will forget that there is an implicit 2 TB max (it would only matter if we ever raised resourceLimits above 2 TB), but if we are running into that, we probably have other problems as well!
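
As an aside, the mem_max escalation above only takes effect if the pipeline retries on OOM-style failures so that task.attempt increments; a hedged sketch of the usual companion settings (assumed here, not taken from this PR's config):

```groovy
process {
    // retry when the previous attempt exited with an OOM-style code (137-140),
    // so task.attempt increments and the mem_max closure requests more memory
    errorStrategy = { task.exitStatus in 137..140 ? 'retry' : 'finish' }
    maxRetries    = 3
}
```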

nextflow.config (Outdated)

```groovy
    max_cpus = 8
    max_memory = 32.GB
}
params.max_memory = 16.GB
```
allyhawkins (Member):

I don't think you are actually using this anymore?

jashapiro requested a review from allyhawkins on May 29, 2024 18:54
allyhawkins (Member) left a comment:

LGTM

jashapiro merged commit 368b91f into main on May 29, 2024 (2 checks passed).
jashapiro deleted the jashapiro/resourceLimits branch on May 30, 2024.