Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rely on API Priority and Fairness (APF) instead of client-side rate limiting #2933

Open
Sijoma opened this issue Aug 26, 2024 · 5 comments
Open

Comments

@Sijoma
Copy link

Sijoma commented Aug 26, 2024

When relying on controller-runtime defaults, it's difficult to spot whether a controller is rate-limited on the client-side. This can have a negative impact on controller performance as the reconcile loop will be slowed down.

The recommendation from slack discussions seems to be to always:

  1. set QPS to -1 everywhere and rely on APF
  2. use the cache wherever possible

One of the concerns was that this might not be safe default when users are relying on older Kubernetes versions. I'm not sure how those older Kubernetes versions would work with an updated controller-runtime in any case? Isn't the client too new then?

Discussion links:

lavalamp: even if it works the right way you should still turn it off and rely on APF

Link to sig-apimachinery recommendation: https://kubernetes.slack.com/archives/C0EG7JC6T/p1680796612287719?thread_ts=1680791299.631439&cid=C0EG7JC6T

Controller Runtime discussion - https://kubernetes.slack.com/archives/C02MRBMN00Z/p1724224928349419

@sbueringer
Copy link
Member

sbueringer commented Aug 26, 2024

I'm not sure how those older Kubernetes versions would work with an updated controller-runtime in any case? Isn't the client too new then?

Controller-runtime in general is compatible with a wide-range of Kubernetes versions on the server-side. Otherwise folks would have to make sure the controller they use use exactly the same Kubernetes version as the server. An example is Cluster API which usually is compatible with ~ 6-7 Kubernetes versions (https://cluster-api.sigs.k8s.io/reference/versions#core-provider-cluster-api-controller).

Of course folks have to be careful about which features of the kube-apiserver and builtin APIs they rely on, but controller-runtime itself tries to not depend on any specific new kube-apiserver feature.

@Sijoma
Copy link
Author

Sijoma commented Sep 6, 2024

Seems like Crossplane is also running into issues around these settings:

https://github.com/crossplane/crossplane/pull/5742/files

@nathanperkins
Copy link

nathanperkins commented Oct 29, 2024

Thanks for bringing up this discussion! We were having problems with backed up workqueues in some of our controllers and disabling the client-side rate limiter in favor of APF took our latency from O(hours) to O(seconds).

If it makes sense, we would love to see related options in controller-runtime client/manager rather than having to modify the *rest.Config before creating the client, but the current solution is workable for us :)

@vincepri
Copy link
Member

This is generally a good idea, if it has to be in controller runtime, which we could support, needs to be able to determine if APF is enabled on the server, or fallback gracefully.

@sbueringer
Copy link
Member

Not sure if we really should add additional options to Client / Manager that would overwrite the corresponding options of the rest.Config

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants