
Specifying memory usage per task? #19

Open

lgarrison opened this issue May 10, 2021 · 3 comments

@lgarrison (Member)

disBatch has worked great for me for dealing with heterogeneous tasks that take different amounts of CPU time. But I'm now facing a set of jobs with heterogeneous memory usage, and I'm stuck requesting a very conservative number of jobs per node so that no node blows out its memory. I can estimate each job's memory usage, and I'm wondering if it would be possible to communicate this information to disBatch so that it would only dispatch a job to a node when that node has enough free memory for it. I guess the syntax might be something like:

#DISBATCH MEM 8GB
job1
job2
#DISBATCH MEM 20GB
job3
etc...

disBatch would also need to know about the available memory on each node, which I guess it could learn through Slurm. For non-Slurm backends, maybe it could be specified manually.
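As a rough illustration of the dispatch rule being proposed here (not disBatch's actual internals), a memory-aware placement loop might look like the sketch below. Node names, capacities, and per-task memory figures are made-up assumptions; the per-node totals would in practice come from Slurm (e.g. `sinfo -N -h -o "%N %m"`) and the per-task figures from the hypothetical `#DISBATCH MEM` directives.

```python
def pick_node(free_mem_mb, task_mem_mb):
    """Return a node with at least task_mem_mb free, or None."""
    for node, free in free_mem_mb.items():
        if free >= task_mem_mb:
            return node
    return None


def dispatch(nodes_mb, tasks):
    """Greedily place (name, mem_mb) tasks onto nodes.

    Tasks that fit nowhere go into a waiting list; a real scheduler
    would retry them as running tasks finish and release memory.
    """
    free = dict(nodes_mb)
    placed, waiting = {}, []
    for name, mem in tasks:
        node = pick_node(free, mem)
        if node is None:
            waiting.append(name)
        else:
            free[node] -= mem
            placed[name] = node
    return placed, waiting


if __name__ == "__main__":
    nodes = {"worker1": 32000, "worker2": 64000}
    tasks = [("job1", 8000), ("job2", 8000),
             ("job3", 20000), ("job4", 60000)]
    placed, waiting = dispatch(nodes, tasks)
    print(placed, waiting)
```

Even this toy version hints at the hard parts the comments below get into: tracking released memory over time, avoiding fragmentation, and not starving large tasks — i.e., reinventing a chunk of a resource manager.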

Do you think a feature like this makes sense for disBatch? I'd be happy to take a stab at this feature if you think so!

@njcarriero (Collaborator)

njcarriero commented Jun 23, 2021

Thanks for the suggestion.

This is something I have been thinking about for a while (as has the related issue of a variable number of cores per task).

I haven't come up with a solution that doesn't involve reinventing a non-trivial portion of a resource manager. If you think you have a good idea, go for it. FYI, the dynamicdb branch will soon(-ish) become release 2.0.

In a pinch, a user can clump small tasks together via shell ops, e.g. using a task like "job1 & job2 ; wait". But that defeats one design goal, which was to provide per-task record keeping.
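The clumping workaround might look like this in a task file (job names hypothetical); each line is one disBatch task, so the paired jobs share a slot but also share a single task record:

```
# Two small jobs share one task line; disBatch's per-task records
# cover the pair rather than each job individually.
small_job_a & small_job_b ; wait

# A memory-hungry job keeps a task line (and hence a slot) to itself.
big_job_c
```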

@lgarrison (Member, Author)

Thanks; I didn't have an implementation immediately in mind, and I agree that one doesn't want to reinvent a resource manager! In the last few weeks my need for this feature has lessened, but I may return to it in the future. The task-clumping idea may help too; thanks for the suggestion.

@xiuliren
I was looking for this feature as well. My pipeline consists of over ten tasks with different resource requirements. I am currently running them manually one by one!

3 participants