[Request] Adjust pool size in runtime #243

Open
showkeyjar opened this issue Jul 13, 2022 · 4 comments

Comments

@showkeyjar

I'm glad that pathos is still active, and I have a new suggestion:

Would you please provide a new feature that lets users adjust their pool size at runtime?

For example, in Jupyter Notebook or JupyterLab, when users process data they have to estimate the number of processes to use, but that estimate is almost always smaller than what the machine could handle, so it wastes time to try again and again. If we had a runtime-adjustment capability, it would be a great help for those jobs.

@mmckerns
Member

@showkeyjar: Thanks for making a request. I'm not sure what you mean; can you explain a bit more? At what point in the workflow do you want to change the size of the Pool? Generally the workflow goes like this:

  1. `p = Pool(4)`: build a pool with a fixed number of workers
  2. `res = p.uimap(lambda x:x*x, range(4))`: execute the jobs on the workers
  3. `list(res)`: get the results from the workers
  4. `p.close(); p.join(); p.clear()`: shut down the pool
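
In code, that workflow looks roughly like this (a minimal sketch, assuming the pool is a `ProcessPool` from `pathos.pools`):

```python
from pathos.pools import ProcessPool

p = ProcessPool(4)                       # 1. build a pool with a fixed number of workers
res = p.uimap(lambda x: x*x, range(4))   # 2. execute the jobs on the workers (unordered imap)
print(list(res))                         # 3. get the results from the workers
p.close(); p.join(); p.clear()           # 4. shut down the pool
```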

At what stage would you want to change the size of the pool, and how exactly?

@showkeyjar
Author

showkeyjar commented Jul 14, 2022

Here is a simple example of a possible solution (a rough sketch follows below):

  1. `p = AutoScalePool(scale_step=1, max_memory=0.8, max_cpu_load=45)`: build a pool that estimates the number of workers.
  2. `res = p.uimap(lambda x:x*x, range(4))`: execute the first batch with the minimum number of workers and record the memory and CPU usage; for the second batch, the number of workers = previous + scale_step, stopping the increase if memory is full, and vice versa.

That way, even if the server has other tasks, it never crashes, and the job still executes while making maximum use of the server's resources.
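
A rough sketch of how such a wrapper might behave between batches; note that `AutoScalePool` is hypothetical and not part of pathos, `psutil` is assumed for sampling memory and CPU load, and both limits are treated as fractions for consistency:

```python
import psutil
from pathos.pools import ProcessPool

class AutoScalePool:
    """Hypothetical wrapper: re-sizes the worker pool between batches
    based on observed memory and CPU load (not part of pathos)."""

    def __init__(self, scale_step=1, max_memory=0.8, max_cpu_load=0.8, min_workers=1):
        self.scale_step = scale_step
        self.max_memory = max_memory      # allowed fraction of total RAM
        self.max_cpu_load = max_cpu_load  # allowed fraction of CPU
        self.min_workers = min_workers
        self.workers = min_workers        # start with the minimum number of workers

    def _adjust(self):
        # sample current load; grow if there is headroom, shrink if over budget
        mem = psutil.virtual_memory().percent / 100.0
        cpu = psutil.cpu_percent(interval=0.1) / 100.0
        if mem < self.max_memory and cpu < self.max_cpu_load:
            self.workers += self.scale_step
        else:
            self.workers = max(self.min_workers, self.workers - self.scale_step)

    def map(self, func, seq, batch_size=100):
        # run the sequence in batches, re-sizing the pool after each batch
        seq, results = list(seq), []
        for i in range(0, len(seq), batch_size):
            pool = ProcessPool(self.workers)
            results.extend(pool.map(func, seq[i:i + batch_size]))
            pool.close(); pool.join(); pool.clear()
            self._adjust()
        return results
```

Usage would look like `res = AutoScalePool(scale_step=1).map(lambda x: x*x, range(1000))`; the first batch runs with the minimum number of workers, and each later batch grows or shrinks by `scale_step` depending on the observed load.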

@mmckerns
Member

It's not clear how you envision `scale_step=1, max_memory=0.8, max_cpu_load=45` working. Is `max_memory` checked after the prior batch of workers has run, and then used to predict what will keep memory under 80% and "adjust" the size of the next batch? If so, I'd expect `max_cpu_load` to be similar, but here you have 45. I also don't understand what the point of `scale_step` is... is it that you are allowing the pool size to change by at most `scale_step` with each new batch?

@mmckerns changed the title from "[New feature advice] Adjust pool size in runtime" to "[Request] Adjust pool size in runtime" on Jul 14, 2022
@showkeyjar
Author

showkeyjar commented Jul 14, 2022

Yes, you are right; it's just a rough example plan, and you can design it according to the actual conditions.

max_cpu_load is usually of no use on a server that has swap; if many processes are running, it only causes the server to run slowly.

But max_memory will cause trouble when memory runs out on Linux, because processes get killed at random.

And if you can check free memory at every batch, then max_memory is also unnecessary.

I'm very excited about this feature; it's like an automatic transmission in a car.
