Too frequent Client.Timeout #1839

Closed
liubog2008 opened this issue Dec 15, 2021 · 6 comments · Fixed by #2383
Labels: customer/feedback (Feedback from customers), impact/reliability (Something that feels unreliable or flaky), kind/enhancement (Improvements or new features), resolution/fixed (This issue was fixed)

Comments

@liubog2008

liubog2008 commented Dec 15, 2021

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

I get too many Client.Timeout errors when I run pulumi up. A global CLI timeout option is needed.

I don't know whether the problem is caused by the OpenAPI schema not being cached on disk (it is only cached in memory). Sample error output:

configured Kubernetes cluster is unreachable: unable to load schema information from the API server: unexpected error when reading response body. Please retry. Original error: net/http: request canceled (Client.Timeout or context cancellation while reading body)
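
For readers unfamiliar with this error text: the "(Client.Timeout or context cancellation while reading body)" suffix is produced by Go's net/http when a client's overall Timeout expires after the response headers have arrived but before the body has been fully read. The following is a minimal sketch that reproduces that class of error against a local test server. It uses only the Go standard library, not the Pulumi provider or client-go, and the function names (slowBodyError, isTimeout, demoTimeout) are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"time"
)

// slowBodyError starts a local test server that sends the headers and the
// first bytes of the body right away and then stalls. The client's Timeout
// is meant to be shorter than the stall, so the timeout fires while the
// body is still being read. The resulting error is returned.
func slowBodyError(clientTimeout, stall time.Duration) error {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("partial schema..."))
		if f, ok := w.(http.Flusher); ok {
			f.Flush() // push headers and the first bytes to the client
		}
		// Stall, but stop early once the client has gone away.
		select {
		case <-time.After(stall):
		case <-r.Context().Done():
		}
	}))
	defer srv.Close()

	client := &http.Client{Timeout: clientTimeout}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	_, err = io.ReadAll(resp.Body)
	return err
}

// isTimeout reports whether err (or an error it wraps) is a network timeout.
func isTimeout(err error) bool {
	var ne net.Error
	return errors.As(err, &ne) && ne.Timeout()
}

// demoTimeout runs the reproduction with a short client timeout and a
// much longer server-side stall.
func demoTimeout() error {
	return slowBodyError(200*time.Millisecond, 5*time.Second)
}

func main() {
	err := demoTimeout()
	fmt.Println("timeout:", isTimeout(err))
	fmt.Println("error:", err)
}
```

A large response (for example, schema information for many CRDs) over a slow link behaves like the stalled body here: the headers arrive, but the body does not finish within the client's timeout.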

Maybe a disk-cached OpenAPI schema is needed (reusing kubectl's cache?).

By the way, if I add --refresh, it always times out.
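
To make the disk-cache idea concrete, here is a rough sketch of a file-backed cache with a time-to-live, written with the Go standard library only. It is not how the provider or kubectl implements caching; cachedFetch, demoFetchCount, the file layout, and the TTL are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cachedFetch returns the contents of path if the file exists and is
// younger than ttl. Otherwise it calls fetch, stores the result at path,
// and returns it. fetch stands in for the expensive schema download.
func cachedFetch(path string, ttl time.Duration, fetch func() ([]byte, error)) ([]byte, error) {
	if info, err := os.Stat(path); err == nil && time.Since(info.ModTime()) < ttl {
		if data, err := os.ReadFile(path); err == nil {
			return data, nil // fresh cache hit
		}
	}

	data, err := fetch()
	if err != nil {
		return nil, err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return nil, err
	}
	// Write to a temporary file and rename it, so that a concurrent
	// reader never observes a half-written cache entry.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return nil, err
	}
	if err := os.Rename(tmp, path); err != nil {
		return nil, err
	}
	return data, nil
}

// demoFetchCount calls cachedFetch twice against a temporary directory
// and reports how many times the underlying fetch actually ran.
func demoFetchCount() (int, error) {
	dir, err := os.MkdirTemp("", "schema-cache")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	calls := 0
	fetch := func() ([]byte, error) {
		calls++
		return []byte(`{"openapi":"stub"}`), nil
	}

	path := filepath.Join(dir, "discovery", "openapi.json")
	for i := 0; i < 2; i++ {
		if _, err := cachedFetch(path, time.Minute, fetch); err != nil {
			return calls, err
		}
	}
	return calls, nil
}

func main() {
	n, err := demoFetchCount()
	fmt.Println("fetch calls:", n, "err:", err)
}
```

With a cache like this, only the first run pays for the slow download; later runs within the TTL read from disk. A real implementation would also need an invalidation story for when CRDs change.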

Steps to reproduce

Expected:

  1. Global CLI timeout flag.
  2. Disk-cached OpenAPI schema discovery.

Actual:

@liubog2008 liubog2008 added the kind/bug Some behavior is incorrect or out of spec label Dec 15, 2021
@mikhailshilkov
Member

@liubog2008 Do you have a code snippet that I could try to reproduce this error?

@mikhailshilkov mikhailshilkov added the awaiting-feedback Blocked on input from the author label Dec 16, 2021
@liubog2008
Author

To reproduce it, you probably need a slow network and many CRDs installed in the Kubernetes cluster.

@liubog2008
Author

It works fine if the network is fast.

@mikhailshilkov mikhailshilkov added impact/reliability Something that feels unreliable or flaky and removed awaiting-feedback Blocked on input from the author labels Dec 16, 2021
@liubog2008
Author

@mikhailshilkov Any progress?

@tusharshahrs

tusharshahrs commented Apr 24, 2023

Adding relevant notes from #2365 (referenced above), which was closed.
Default timeout of 32s comes from here: https://github.com/kubernetes/client-go/blob/8005e0d28ba2db8a60bb8dbe8a5a01fed4872f67/discovery/discovery_client.go#L55
defaultTimeout = 32 * time.Second

  • Where does this timeout value come from?
  • Is there a way to increase it via environment variables? There don’t appear to be any provider-specific toggles.
  • From what I did gather while scanning the k8s.io/client-go source, it has a fairly constrained approach to automatically retrying requests. I’m not sure that it automatically retries timeouts, so the backoff base and duration that I mentioned above may not apply. I think the solution to this issue involves two changes:
  1. The pulumi-kubernetes provider should expose an optional parameter on the provider enabling it to override the default timeout value set by k8s.io/client-go.

  2. If automatic retry logic does not exist for query/idempotent requests, retry logic should be added. If retry logic does exist, perhaps its behavior could be exposed as parameters on the provider (i.e. # of retry attempts, backoff base, backoff duration)

Not sure what the negative implications of this would be, if any. There also seem to be two environment variables we could leverage deep in the k8s.io/client-go package: KUBE_CLIENT_BACKOFF_BASE and KUBE_CLIENT_BACKOFF_DURATION.
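
As an illustration of the retry behavior proposed above (number of attempts, backoff base, backoff duration), here is a minimal sketch of retrying an idempotent operation with capped exponential backoff. It uses only the Go standard library; retryOptions, retryIdempotent, and demoRetry are hypothetical names and do not correspond to any existing provider or client-go parameters.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryOptions are hypothetical knobs, mirroring the parameters
// suggested in the comment: attempts, backoff base, and a cap.
type retryOptions struct {
	Attempts int           // total tries, including the first one
	Base     time.Duration // delay before the second try
	Max      time.Duration // upper bound for any single delay
}

// retryIdempotent calls op until it succeeds or the attempts run out,
// doubling the delay after each failure up to opts.Max. The sleep
// function is injected so that tests do not have to wait in real time.
// It should only be used for operations that are safe to repeat.
func retryIdempotent(opts retryOptions, sleep func(time.Duration), op func() error) error {
	if opts.Attempts < 1 {
		opts.Attempts = 1
	}
	delay := opts.Base
	var err error
	for attempt := 1; attempt <= opts.Attempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt == opts.Attempts {
			break // no point sleeping after the final failure
		}
		sleep(delay)
		delay *= 2
		if delay > opts.Max {
			delay = opts.Max
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", opts.Attempts, err)
}

// demoRetry simulates an operation that times out twice and then
// succeeds, recording the delays instead of actually sleeping.
func demoRetry() (calls int, delays []time.Duration, err error) {
	op := func() error {
		calls++
		if calls < 3 {
			return errors.New("simulated timeout")
		}
		return nil
	}
	record := func(d time.Duration) { delays = append(delays, d) }
	opts := retryOptions{Attempts: 5, Base: time.Second, Max: 10 * time.Second}
	err = retryIdempotent(opts, record, op)
	return calls, delays, err
}

func main() {
	calls, delays, err := demoRetry()
	fmt.Println("calls:", calls, "delays:", delays, "err:", err)
}
```

Passing time.Sleep as the sleep argument gives real backoff. Whether timeouts specifically should be retried, and with what defaults, is exactly the open question raised in this comment.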

@lukehoban lukehoban added kind/enhancement Improvements or new features and removed kind/bug Some behavior is incorrect or out of spec labels Apr 27, 2023
@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label May 11, 2023
@lukehoban lukehoban added this to the 0.88 milestone May 15, 2023
@PRIHLOP

PRIHLOP commented Jan 10, 2025

I agree with @tusharshahrs; he described the requirements well. However, only the first point has been implemented, and it does not solve all of the issues with connecting to a k8s cluster.
I think the second point of @tusharshahrs's suggestion also needs to be implemented, because I currently have a problem with retries for connecting to the Kubernetes API not being configurable:

If automatic retry logic does not exist for query/idempotent requests, retry logic should be added. If retry logic does exist, perhaps its behavior could be exposed as parameters on the provider (i.e. # of retry attempts, backoff base, backoff duration)

7 participants