Limit the compute job thread pool size #6624
base: dev
Conversation
The router has always observed the `APOLLO_ROUTER_NUM_CORES` environment variable to restrict the size of the main tokio async job scheduler. We are now enhancing the compute job thread pool to respect this environment variable as well, restricting the number of threads in the pool.

If the environment variable is not set, the size of the pool is computed as a fraction of the total number of cores that the router has determined are available. If it is set, the environment variable is taken as the number of available cores. From this number of available cores, the router sizes the compute job thread pool according to the following table:

| available | pool size |
|-----------|-----------|
| 1         | 1         |
| 2         | 1         |
| 3         | 2         |
| 4         | 3         |
| 5         | 4         |
| ...       | ...       |
| 8         | 7         |
| 9         | 7         |
| ...       | ...       |
| 16        | 14        |
| 17        | 14        |
| ...       | ...       |
| 32        | 28        |

This table is provided for informational purposes only and should not be relied upon as an explicit interface, since it may change in the future.
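For illustration, here is a minimal Rust sketch of sizing logic consistent with the table above. The 7/8 ratio is inferred from the table values, and the function names (`available_cores`, `compute_pool_size`) are hypothetical; this is not the router's actual implementation.

```rust
use std::env;
use std::thread;

/// Determine the number of "available" cores: the APOLLO_ROUTER_NUM_CORES
/// override if it is set and parses, otherwise the parallelism the host
/// reports. (Hypothetical helper, for illustration only.)
fn available_cores() -> usize {
    env::var("APOLLO_ROUTER_NUM_CORES")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or_else(|| {
            thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
        })
}

/// Size the compute job thread pool from the available core count.
/// The table above is consistent with floor(available * 7/8), with a
/// floor of one thread; the 7/8 ratio is an assumption inferred from
/// the table, not a documented constant.
fn compute_pool_size(available: usize) -> usize {
    std::cmp::max(1, available * 7 / 8)
}

fn main() {
    let available = available_cores();
    println!(
        "available: {available} pool size: {}",
        compute_pool_size(available)
    );
}
```

Running this with `APOLLO_ROUTER_NUM_CORES=16` would print `available: 16 pool size: 14`, matching the table.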
I am going to try backporting it to 1.x so I can run some more perf tests.
@Mergifyio backport 1.x

🟠 Waiting for conditions to match
Any chance you would be willing to incorporate the spirit of #6663 into this PR and log the effective pool size and queue size for diagnostics/info? As a customer, I'd really appreciate having this available and being able to see the reality of these fairly important structures, based on the number of cores dialed in for the router, e.g. in Kubernetes deployments.