Prefect 2.7.0 (cloud): Failed to get infrastructure for flow run #7796
Comments
Thank you very much for these references. Just some comments which may or may not be helpful at all. I apologize if they are not:
Thanks again for the link to the issues and PRs.
Ah I see - #7442 looks similar.
I encountered the issue before 2.7.0, and again 3 days after upgrading to Prefect 2.7.0. I think the retries by themselves might not have been enough to solve this 🤔 Maybe disabling HTTP/2 will prevent it from happening again. (Low-priority technical question: do you have a suspicion why HTTP/2 might be an issue here?)
These protocol errors are all specific to HTTP/2 semantics. We believe they are specifically related to improper handling of the GOAWAY frame by the h2 library. We're working with the httpcore/httpx maintainers to try to determine an upstream fix, but it's mostly out of our control.
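For readers following the retry discussion, here is a minimal sketch of what retrying on a protocol error could look like on the client side. It is not Prefect's or httpx's actual implementation: `ProtocolError` is a hypothetical stand-in exception, and `call_with_retries`, `max_attempts`, and `base_delay` are names made up for illustration.

```python
import time


class ProtocolError(Exception):
    """Hypothetical stand-in for an HTTP/2 protocol error raised by a client."""


def call_with_retries(request, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call request() and retry on ProtocolError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ProtocolError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Note that a wrapper like this only helps when the failure is transient. If the connection is left in a bad state after the error, every retry over the same connection may fail the same way, which would be consistent with the observation above that retries alone were not enough.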
See #7442
First check
Bug summary
A flow is scheduled once every 15 minutes. It uses a work queue with concurrency set to 1. Occasionally (roughly once per week, i.e. about once per 700 runs) one of the flow runs gets stuck in Pending. This is problematic because, with the concurrency limit of 1, the stuck run blocks all subsequent flow runs.
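As a quick sanity check on the "once per 700 runs" estimate, the arithmetic can be sketched as follows. The variable names are illustrative only, and the one-failure-per-week figure is the rough observation from this report, not a measured rate.

```python
from datetime import timedelta

# Schedule from the report: one flow run every 15 minutes.
interval = timedelta(minutes=15)

# Dividing one timedelta by another with // yields a plain integer.
runs_per_week = timedelta(weeks=1) // interval

# Roughly one stuck run per week gives the approximate failure rate.
failure_rate = 1 / runs_per_week

print(f"{runs_per_week} runs per week, failure rate about {failure_rate:.2%}")
```

So a 15-minute schedule produces 672 runs per week, which matches the "about 700" figure in the summary.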
Reproduction
Error
Versions
Agent:
Job: Kubernetes Job