federation-wide repo quotas #2143

Open
minrk opened this issue Mar 8, 2022 · 3 comments

Comments

@minrk
Member

minrk commented Mar 8, 2022

We have per-repo quotas, but these are applied separately at each federation member. This means that a repo with a per-repo limit of 100 can actually take up 100 slots on each federation member, or over 60% of our whole federation capacity right now. This wouldn't be a huge issue if our per-repo limit were significantly less than each federation member's capacity, but each federation member only allows 80-200 slots. It becomes more of an issue now that GKE has a limit, which effectively increases the quota for repos assigned to GKE any time GKE itself is considered full.

To apply limits properly, we need federated quotas that count sessions for each repo across all members. That sounds complicated, but I don't know another way.
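As a rough illustration of what counting across members would mean, here is a minimal sketch. It assumes each member can report its running sessions per repo as a plain mapping; the member names, repo names, and the `federated_sessions` and `launch_allowed` helpers are hypothetical and not part of BinderHub.

```python
from typing import Mapping

# Hypothetical snapshot type: member name -> {repo -> running sessions}.
SessionCounts = Mapping[str, Mapping[str, int]]


def federated_sessions(counts_by_member: SessionCounts, repo: str) -> int:
    """Sum the running sessions for `repo` over every federation member."""
    return sum(
        member_counts.get(repo, 0)
        for member_counts in counts_by_member.values()
    )


def launch_allowed(counts_by_member: SessionCounts, repo: str, quota: int) -> bool:
    """Allow a launch only while the federation-wide count is below the quota."""
    return federated_sessions(counts_by_member, repo) < quota


# Made-up numbers: each member is individually under a quota of 100,
# but the federation-wide total for the repo is not.
counts = {
    "member-a": {"org/popular-repo": 60, "org/other-repo": 5},
    "member-b": {"org/popular-repo": 45},
}
```

With these made-up numbers, a per-member check would still admit launches of `org/popular-repo` on both members, while the federated check would refuse them. How the per-member counts are actually collected is the complicated part and is left out here.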

@betatim
Member

betatim commented Mar 8, 2022

Trying to think of ways to avoid solving a complicated problem: could we lower the per-repo quota from 100 to 30? That would get us back into the regime of "the per-repo quota is much smaller than the capacity of a federation member".

We'd have to find out how many concurrent sessions the "top repos" have to decide whether 30 is too low. I think we picked the 100 limit more or less out of thin air: "seems about right and like no one would ever run into it".

@manics
Member

manics commented Mar 8, 2022

Alternatively, if we change the quota to launches per hour and ignore how long the pods run, we could avoid querying the K8s API across all members. Instead, we could run a single service that records the number of launches as a counter with a TTL. The code at
https://github.com/jupyterhub/binderhub/blob/98940b22d99ab6efa0c52b00dcb718eb7ac4282a/binderhub/builder.py#L562-L573
would make an HTTP request to the service to:

  • check whether the quota is exceeded
  • update the counter if it's not

Since this is a single HTTP request with no backend K8s calls, it should be quick.
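The check-and-update logic such a service would need can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the `LaunchCounter` class, its `try_launch` method, and the one-hour default TTL are hypothetical names and values, and the HTTP layer and any shared storage are left out.

```python
import time
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque


class LaunchCounter:
    """Hypothetical per-repo launch counter where each launch expires after a TTL."""

    def __init__(
        self,
        quota: int,
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.quota = quota
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._launches: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def try_launch(self, repo: str) -> bool:
        """Check the quota and record the launch in one step.

        Returns True (and counts the launch) if the repo is under quota,
        False (and counts nothing) otherwise.
        """
        now = self.clock()
        launches = self._launches[repo]
        # Drop launches whose TTL has elapsed.
        while launches and now - launches[0] >= self.ttl_seconds:
            launches.popleft()
        if len(launches) >= self.quota:
            return False
        launches.append(now)
        return True
```

Doing the check and the update in a single call matches the two bullets above and keeps it to one request per launch. A real deployment would presumably want the counts in a store that all members can reach, rather than in process memory.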

@minrk
Member Author

minrk commented Mar 8, 2022

I implemented two drafts:
