autoscale / allocate jobs based on metrics use case #118
Comments
Sounds like you are issuing jobs for a specific worker? If not, you can start as many workers as you want and they will scale up and down automatically. The only question is how to handle in-flight jobs, as today I don't support a signal that says "complete the current job and exit" rather than "complete the current job and get the next one". How you handle autoscaling per se is out of the control of this tool though, afaik?
Thanks for asking, I'm sure you're pretty busy. I would like to get some experience with asyncjobs, and this seems like a really useful use case too.
Yes. The autoscaler is this: https://github.com/woodpecker-ci/autoscaler. A Hetzner robot is included that talks to the Hetzner IaaS to control VMs.
Yes, I agree. In this case, it's a single-run style task: run and, when done, please die, basically. That means I don't have to do any rebalancing, because we are basically doing the serverless pattern, which is a smarter pattern I think :) It makes me think of Google Cloud Run, where an instance only dies if there have not been any HTTP calls for the last 5 minutes. I think these 2 logic patterns are the way to do it. It's simpler than what I proposed originally.
To calculate the autoscaling, https://github.com/woodpecker-ci/autoscaler/blob/main/main.go#L23 holds a reference to all the agents. I am realising that the way to build this is to have a dummy VM pool locally, to simulate a Hetzner VM.
https://github.com/windsource/picus does the same logic and uses Woodpecker CI.
I have a use case where I need to run long-running jobs on Hetzner, where the Hetzner robot allows me to add and remove VMs.
So it will allow me to autoscale on cheap hardware.
In order to do this I need to detect RAM and CPU usage on each server where the asyncjobs agent runs. Detection is pretty easy.
Logic is:
To implement the logic of the 2 use cases, NATS KV would be an easy win.
If each agent sends metrics every minute and NATS KV expires entries on a TTL of 1 hour, it will run itself.
Then core can issue jobs based on these metrics.