Bump initial memory and cpu request for CellAssign #607

Merged
merged 3 commits into from
Dec 7, 2023

Conversation

allyhawkins
Member

In attempting to run the first two projects through CellAssign, I am noticing a few things:

  • The first is that most of the samples fail on the first two attempts at running CellAssign and are more likely to succeed on the third attempt (although not always). This tells me we should request more memory initially. I would also bump up the CPUs so that, hopefully, this takes less time and is less likely to get killed.
  • The second is that every run has been killed by the external system after ~5-7 hours. In some cases, CellAssign has completed, and in others none of the processes have completed. I'm not entirely sure what else we can do other than bump up requests. I wonder if a timeout is happening?

Below is a copy of the error that I'm consistently getting:
Screenshot 2023-12-07 at 9 38 51 AM

I've also attached the log file so that you can see the complete output.
nextflow.log

@allyhawkins allyhawkins requested a review from jashapiro December 7, 2023 15:41
Member

@jashapiro jashapiro left a comment

I suggested being a bit more conservative with CPU requests, but other than that, this looks fine.

The jobs getting killed could be because these are spot instances, and AWS may be killing them randomly. The best way around that is to make them run faster, so upping the CPUs is probably the right move for that as well. But I am still wary of jumping too high on the CPU count if the payoff isn't there.

- label 'mem_32'
- label 'cpus_12'
+ label 'mem_128'
+ label 'cpus_32'
Member

I'm a little worried that increasing both threads and memory at the same time will mean that you will still hit memory limits. So I might be more conservative here and only up the CPUs to 24 for now? Also, multithreading tends to be less efficient with more cores, depending on how well it is implemented.
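
To illustrate the diminishing returns from extra cores, here is a minimal sketch based on Amdahl's law. The parallel fraction of 0.90 is an assumed value for illustration only, not something measured for CellAssign, and `amdahl_speedup` is a hypothetical helper rather than part of the workflow.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup under Amdahl's law for a task that is partly parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part does not shrink; only the parallel part is divided by cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    assumed_parallel_fraction = 0.90  # assumption, not a CellAssign measurement
    for cores in (12, 24, 32):
        speedup = amdahl_speedup(assumed_parallel_fraction, cores)
        efficiency = speedup / cores  # speedup obtained per core requested
        print(f"{cores:>2} cores: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

Under this model, the gain from 24 to 32 cores is smaller than the gain from 12 to 24, and per-core efficiency drops as cores are added, so the real payoff would need to be checked against actual run times.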

Member Author

Okay yea I wasn't sure how much to bump it up to. I'll change it to 24 and then give it a go 🤞

@allyhawkins allyhawkins merged commit 768c207 into main Dec 7, 2023
3 checks passed
@allyhawkins allyhawkins deleted the allyhawkins/bump-cellassign-memory branch December 7, 2023 16:04