[Request] Adjust pool size in runtime #243

Open
showkeyjar opened this issue Jul 13, 2022 · 4 comments

Comments

@showkeyjar

I'm glad that pathos is still active, and I have a new suggestion:

Would you please provide a new feature that lets users adjust their pool size at runtime?

For example, in Jupyter Notebook or JupyterLab, when users process data they have to estimate the number of processes to use, but that estimate is almost always smaller than what the machine could handle, so it wastes time to try again and again. If we had a runtime-adjustment capability, it would be a great help for those jobs.

@mmckerns
Member

@showkeyjar: Thanks for making a request. I'm not sure what you mean; can you explain a bit more? At what point in the workflow do you want to change the size of the Pool? Generally the workflow goes like this:

  1. `p = Pool(4)`: build a pool with a fixed number of workers
  2. `res = p.uimap(lambda x:x*x, range(4))`: execute the jobs on the workers
  3. `list(res)`: get the results from the workers
  4. `p.close(); p.join(); p.clear()`: shut down the pool
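
In code, that workflow looks roughly like this (a minimal sketch, assuming the pool is a `ProcessPool` from `pathos.pools`):

```python
from pathos.pools import ProcessPool

p = ProcessPool(4)                       # 1. build a pool with a fixed number of workers
res = p.uimap(lambda x: x*x, range(4))   # 2. execute the jobs on the workers (unordered imap)
print(list(res))                         # 3. get the results from the workers
p.close(); p.join(); p.clear()           # 4. shut down the pool
```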

At what stage would you want to change the size of the pool, and how exactly?

@showkeyjar
Author

showkeyjar commented Jul 14, 2022

Here is a simple example of a possible solution (a rough sketch follows below):

  1. `p = AutoScalePool(scale_step=1, max_memory=0.8, max_cpu_load=45)`: build a pool that estimates the number of workers.
  2. `res = p.uimap(lambda x:x*x, range(4))`: execute the first batch with the minimum number of workers and record the memory and CPU usage; for the second batch, the number of workers = previous + scale_step, stopping the increase if memory is full, and vice versa.

That way, even if the server has other tasks, it never crashes, and the job still executes while making maximum use of the server's resources.
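
A rough sketch of how such a wrapper might behave between batches; note that `AutoScalePool` is hypothetical and not part of pathos, `psutil` is assumed for sampling memory and CPU load, and both limits are treated as fractions for consistency:

```python
import psutil
from pathos.pools import ProcessPool

class AutoScalePool:
    """Hypothetical wrapper: re-sizes the worker pool between batches
    based on observed memory and CPU load (not part of pathos)."""

    def __init__(self, scale_step=1, max_memory=0.8, max_cpu_load=0.8, min_workers=1):
        self.scale_step = scale_step
        self.max_memory = max_memory      # allowed fraction of total RAM
        self.max_cpu_load = max_cpu_load  # allowed fraction of CPU
        self.min_workers = min_workers
        self.workers = min_workers        # start with the minimum number of workers

    def _adjust(self):
        # sample current load; grow if there is headroom, shrink if over budget
        mem = psutil.virtual_memory().percent / 100.0
        cpu = psutil.cpu_percent(interval=0.1) / 100.0
        if mem < self.max_memory and cpu < self.max_cpu_load:
            self.workers += self.scale_step
        else:
            self.workers = max(self.min_workers, self.workers - self.scale_step)

    def map(self, func, seq, batch_size=100):
        # run the sequence in batches, re-sizing the pool after each batch
        seq, results = list(seq), []
        for i in range(0, len(seq), batch_size):
            pool = ProcessPool(self.workers)
            results.extend(pool.map(func, seq[i:i + batch_size]))
            pool.close(); pool.join(); pool.clear()
            self._adjust()
        return results
```

Usage would look like `res = AutoScalePool(scale_step=1).map(lambda x: x*x, range(1000))`; the first batch runs with the minimum number of workers, and each later batch grows or shrinks by `scale_step` depending on the observed load.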

@mmckerns
Member

It's not clear how you envision `scale_step=1, max_memory=0.8, max_cpu_load=45` working. Is `max_memory` checked after the prior batch of workers has run, and then used to predict what will keep memory under 80% and "adjust" the size of the next batch? If so, I'd expect `max_cpu_load` to be similar, but here you have 45. I also don't understand what the point of `scale_step` is... is it that you are allowing the pool size to change by at most `scale_step` with each new batch?

@mmckerns changed the title from "[New feature advice] Adjust pool size in runtime" to "[Request] Adjust pool size in runtime" on Jul 14, 2022
@showkeyjar
Author

showkeyjar commented Jul 14, 2022

Yes, you are right; it's just a rough example plan, and you can design it according to the actual conditions.

max_cpu_load is usually of no use on a server that has swap; if many processes are running, it only causes the server to run slowly.

But max_memory will cause trouble when memory runs out on Linux, because processes get killed at random.

And if you can check free memory at every batch, then max_memory is also unnecessary.

I'm very excited about this feature; it's like an automatic transmission in a car.
