"Fail Safe" control knob for Extension Server #4155
Comments
We fail closed by default for all Policy APIs. Should we do the same for the extension manager (if unable to connect, or unable to get a response in time)? PTAL @envoyproxy/gateway-maintainers. If we are unable to connect during startup, no config would ever be programmed in Envoy Proxy; if the gRPC request times out, we could skip that xDS update, and the data plane would keep the last good config.
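To make the "skip that xDS update and keep the last good config" idea concrete, here is a minimal Go sketch. It is not Envoy Gateway code: `Snapshot`, `Hook`, `Updater`, `ErrNoConfig`, and both stand-in hooks are hypothetical names invented for illustration, and the timeout is enforced with a plain `context.WithTimeout` instead of a real gRPC client.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Snapshot is a hypothetical stand-in for one set of xDS resources.
type Snapshot struct {
	Version   int
	Resources []string
}

// Hook is a hypothetical stand-in for the gRPC call to the extension server.
type Hook func(ctx context.Context, s Snapshot) (Snapshot, error)

// ErrNoConfig reports that a hook failed before any config was ever accepted.
var ErrNoConfig = errors.New("no config programmed yet")

// Updater remembers the last snapshot that made it through the hook.
// It is not safe for concurrent use; a real implementation would need locking.
type Updater struct {
	Timeout  time.Duration
	lastGood *Snapshot
}

// Update runs the hook with a deadline. On success the result becomes the new
// last good snapshot. On failure or timeout the candidate is skipped and the
// previous last good snapshot is returned together with the error.
// A real caller would likely pass in its own context instead of Background.
func (u *Updater) Update(candidate Snapshot, hook Hook) (Snapshot, error) {
	ctx, cancel := context.WithTimeout(context.Background(), u.Timeout)
	defer cancel()

	result, err := hook(ctx, candidate)
	if err == nil {
		u.lastGood = &result
		return result, nil
	}
	if u.lastGood == nil {
		// Startup case: nothing has been programmed, so there is nothing to keep.
		return Snapshot{}, fmt.Errorf("%w: %v", ErrNoConfig, err)
	}
	return *u.lastGood, err
}

// passThroughHook is a stand-in for a healthy extension server.
func passThroughHook(_ context.Context, s Snapshot) (Snapshot, error) {
	return s, nil
}

// hangingHook is a stand-in for an extension server that never answers in time.
func hangingHook(ctx context.Context, _ Snapshot) (Snapshot, error) {
	<-ctx.Done()
	return Snapshot{}, ctx.Err()
}
```

With this shape, a timed-out call after at least one success leaves the data plane on the previous snapshot, while a failure before any success yields no config at all, which matches the startup case described above.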
Maybe add a flag to the extension manager configuration section to specify whether the extension manager should fail open or fail closed?
Yeah, that's a good home for the mode config if user expectations vary. We still need to figure out what the default mode should be: fail open or fail closed 😄 I vote for fail closed by default, to ensure we fail fast and early like we do for Filters and Policies.
If I understand correctly, since the extension has several hooks (HTTPListener, VirtualHost, Route, and Translation), it would make sense to use the 500 behavior when the Route hook fails. But what about the other hooks?
I'm not sure about the others, but in the case of HTTPListener, in "failsafe mode" the listener should not be added to the list of active listener resources if the hook fails.
Currently, the processing of Extension Server logic (i.e. the `Translate()` step) is "best effort". If the Extension Server fails in some way (e.g. it is not available, or it crashes during request handling), then the Envoy configuration is not impacted. In some situations, this could be a "bad thing". For example, if the Extension Server is being used to add a default Authz filter to all Listeners and the gRPC call to the Extension Server fails, then the Listener will still be activated but will not have the Authz filter (thus incorrectly exposing the resource without the desired protection).
Ideally, similar to #3873, there would be an option added to `ExtensionManager` which would allow either "fail open" (the current best-effort behavior) or "fail closed" (an alternate behavior that disables the resource associated with the failed hook).