OCPBUGS-48177: UPSTREAM: <carry>: disable etcd readiness checks by default #2174
base: master
Conversation
@ingvagabund: This pull request references Jira Issue OCPBUGS-48177, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@ingvagabund: the contents of this pull request could not be automatically validated. The following commits could not be validated and must be approved by a top-level approver:
Force-pushed from e3ba771 to fb986c1
@ingvagabund: This pull request references Jira Issue OCPBUGS-48177, which is invalid:
/jira refresh
@ingvagabund: This pull request references Jira Issue OCPBUGS-48177, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact:
Force-pushed from fb986c1 to aed00d9
// This is a downstream patch only as OpenShift's way of using etcd is unique.
readyzChecks := []healthz.HealthChecker{}
for _, check := range healthChecks {
	if check.Name() == "etcd" || check.Name() == "etcd-readiness" {
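For context while reading the quoted diff, here is a minimal, self-contained sketch of the exclusion pattern the loop above appears to implement. It is an assumption here that matching checks are simply skipped rather than collected elsewhere, and the local HealthChecker interface only mirrors the one from k8s.io/apiserver/pkg/server/healthz:

package main

import (
	"fmt"
	"net/http"
)

// HealthChecker mirrors the interface from k8s.io/apiserver/pkg/server/healthz.
type HealthChecker interface {
	Name() string
	Check(req *http.Request) error
}

// namedCheck is a trivial checker used only for this illustration.
type namedCheck struct{ name string }

func (c namedCheck) Name() string                { return c.name }
func (c namedCheck) Check(_ *http.Request) error { return nil }

// filterEtcdChecks drops the etcd readiness checks before they would be wired
// into /readyz; all other checks are kept untouched.
func filterEtcdChecks(healthChecks []HealthChecker) []HealthChecker {
	readyzChecks := []HealthChecker{}
	for _, check := range healthChecks {
		if check.Name() == "etcd" || check.Name() == "etcd-readiness" {
			continue // assumption: excluded checks are simply skipped here
		}
		readyzChecks = append(readyzChecks, check)
	}
	return readyzChecks
}

func main() {
	checks := []HealthChecker{namedCheck{"ping"}, namedCheck{"etcd"}, namedCheck{"etcd-readiness"}}
	for _, c := range filterEtcdChecks(checks) {
		fmt.Println(c.Name()) // prints only "ping"
	}
}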
Instead of this patch, shouldn't we exclude the desired check directly from kas-o? (xref: https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/assets/kube-apiserver/pod.yaml#L111)
I have not properly checked, though: are all KA instances always exposed through service endpoints or a load balancer? Cloud provider LB health check paths are not expected to accept any queries/params like ?exclude=... (AWS, Azure, GCP at least). Unless I misunderstood, both openshift-apiserver and oauth-apiserver rely on k8s.io/apiserver/pkg/server code, i.e. https://github.com/openshift/openshift-apiserver/blob/master/go.mod#L197 and https://github.com/openshift/oauth-apiserver/blob/master/go.mod#L144.
Is https://github.com/openshift/kubernetes-apiserver expected to be in sync with https://github.com/openshift/kubernetes?
I am pretty sure it will work because we already exclude etcd from the liveness probe (xref: https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/assets/kube-apiserver/pod.yaml#L105)
All right. So this will work for KA instances and https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/assets/kube-apiserver/svc.yaml. What about o-a and oauth-a? I was told they are added to a LB. Through:
- https://github.com/kubernetes-sigs/cluster-api-provider-aws
- https://github.com/kubernetes-sigs/cluster-api-provider-azure
- https://github.com/kubernetes-sigs/cluster-api-provider-gcp
which OpenShift relies on when creating the LBs. It seems to be a pattern to ignore/reject any such exclusion in the LB's health check paths: https://issues.redhat.com/browse/OCPBUGS-48177?focusedId=26397412&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-26397412.
So even though ?exclude=... could be used for KA instances, it will not work for o-a/oauth-a. Also, disabling the checks directly in the code, instead of at every place /readyz is accessed, leaves no room for "forgotten" cases.
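To make the trade-off concrete, here is a small, self-contained sketch of the request-time alternative discussed above: a /readyz handler that honors an ?exclude=<check> query parameter, loosely modeled on the upstream healthz behavior. The handler, check names, and port are illustrative stand-ins, not the actual kube-apiserver code; the point is that a cloud LB health check probing a fixed path has no way to append such a parameter:

package main

import (
	"fmt"
	"net/http"
)

// check is a named readiness check; the names and logic are illustrative.
type check struct {
	name string
	run  func() error
}

// readyzHandler runs every check except those listed in ?exclude=...,
// mimicking the query-parameter escape hatch discussed above.
func readyzHandler(checks []check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		excluded := map[string]bool{}
		for _, name := range r.URL.Query()["exclude"] {
			excluded[name] = true
		}
		for _, c := range checks {
			if excluded[c.name] {
				continue // skipped only when the caller can pass query params
			}
			if err := c.run(); err != nil {
				http.Error(w, fmt.Sprintf("[-]%s failed: %v", c.name, err), http.StatusInternalServerError)
				return
			}
		}
		fmt.Fprintln(w, "ok")
	}
}

func main() {
	checks := []check{
		{name: "ping", run: func() error { return nil }},
		{name: "etcd", run: func() error { return fmt.Errorf("etcd unavailable") }},
	}
	// GET /readyz              -> 500 because the etcd check fails
	// GET /readyz?exclude=etcd -> 200 ok
	http.Handle("/readyz", readyzHandler(checks))
	fmt.Println(http.ListenAndServe(":8080", nil))
}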
What about o-a and oauth-a?
We could also exclude the desired checks directly from the operators, for example:
https://github.com/openshift/cluster-openshift-apiserver-operator/blob/master/bindata/v3.11.0/openshift-apiserver/deploy.yaml#L122
I was told they are added to a LB
I don't think this is true. The extension API servers are proxied by the KAS.
So you are saying we do not rely on https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/pkg/cloud/services/elb/loadbalancer.go#L212?
We do but I think only for KAS.
We could verify it. Please create a cluster on AWS and then log in to the AWS console and check the setting for the LB.
Lukasz and I synced over a call to discuss this. Main points:
- kube-aggregator (as a proxy) discovers api services (through APIService objects) and registers a specific group to a set of api servers (e.g. openshift-apiserver, oauth-service)
- kube-aggregator service resolver consumes endpoints:
- https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/kube-aggregator/pkg/controllers/status/remote/remote_available_controller.go#L286C28-L286C43
- https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/kube-aggregator/pkg/apiserver/resolvers.go#L46
- https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/util/proxy/proxy.go#L73C25-L73C33
- endpoints controller takes into account whether a pod is ready or not.
- https://github.com/openshift/cluster-kube-apiserver-operator/blob/master/bindata/assets/kube-apiserver/svc.yaml does not specify .spec.publishNotReadyAddresses, so only ready pods are added to the list of Addresses (the rest go to notReadyAddresses).
- https://github.com/kubernetes/kubernetes/blob/cd5f3d9f9d5ae3153206178e6114d573dc24ad73/staging/src/k8s.io/apiserver/pkg/util/proxy/proxy.go#L87 reads only the .Addresses field, so not-ready pods are excluded (see the sketch after this list).
- As a result, both the https://github.com/openshift/cluster-openshift-apiserver-operator/blob/8236e6ebd065e30dc479a770bb6674165f123f66/bindata/v3.11.0/openshift-apiserver/deploy.yaml#L132 and https://github.com/openshift/cluster-authentication-operator/blob/8538d46b59cca46f6b6987e0f16c946478204f06/bindata/oauth-apiserver/deploy.yaml#L123 /readyz probes need to be updated to ?exclude=etcd.
- The LB sends traffic to the host network -> kube-apiserver instances:
  - meaning the /readyz path is accessed directly (not via endpoints)
  - given the AWS/Azure/GCP LB health check path does not accept queries/params, /readyz needs to be patched within the kube-apiserver code, as is done in this PR.
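A minimal sketch of the Addresses vs. notReadyAddresses point above, using the real corev1.Endpoints types but hand-built sample data; it only illustrates why not-ready pods drop out of the aggregator's target set and is not the aggregator code itself:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// readyURLs mimics, in a very reduced form, what the aggregator's endpoint
// resolver does: it walks the Endpoints subsets and only considers Addresses,
// never NotReadyAddresses, so pods failing /readyz drop out of the proxy's
// target set.
func readyURLs(ep corev1.Endpoints, port int32) []string {
	var urls []string
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses { // NotReadyAddresses are never read
			urls = append(urls, fmt.Sprintf("https://%s:%d", addr.IP, port))
		}
	}
	return urls
}

func main() {
	// Hand-built sample data: one ready pod, one pod whose /readyz is failing.
	ep := corev1.Endpoints{
		Subsets: []corev1.EndpointSubset{{
			Addresses:         []corev1.EndpointAddress{{IP: "10.0.0.1"}},
			NotReadyAddresses: []corev1.EndpointAddress{{IP: "10.0.0.2"}},
		}},
	}
	fmt.Println(readyURLs(ep, 8443)) // [https://10.0.0.1:8443]
}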
/retest-required
I accept the argument for a patch like this, but are the test changes going to be painful to carry over time? Is there an alternative way to structure the test changes that reduces the risk that we someday break the test while resolving rebase conflicts?
Force-pushed from aed00d9 to 1b40bbf
Force-pushed from 1b40bbf to a6f4dc5
Force-pushed from a6f4dc5 to a8242d8
Explicitly exclude the etcd and etcd-readiness checks (OCPBUGS-48177) and have the etcd operator take responsibility for properly reporting etcd readiness. Justification: kube-apiserver instances get removed from a load balancer when etcd starts to report not ready (since KA's /readyz then reports not ready as well). Client connections can tolerate etcd unreadiness for longer than the readiness timeout, so it is not necessary to drop connections when etcd becomes ready again before a client connection times out naturally. This is a downstream-only patch, as OpenShift's way of using etcd is unique.
Force-pushed from a8242d8 to 30b315a
Good point. I have updated the tests to exclude the checks in the main test loop rather than in every case. This will help to reduce the drift.
/lgtm
I think Ben can be a DOWNSTREAM_APPROVER. I'll label this one while he gets a PR to add himself.
[APPROVALNOTIFIER] This PR is APPROVED. Approval requirements bypassed by manually added approval. This pull-request has been approved by: benluddy, ingvagabund. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/retest-required
3 similar comments
/retest-required
/retest-required
/retest-required
/label acknowledge-critical-fixes-only
2 similar comments
@ingvagabund: The following test failed, say
What type of PR is this?
/kind feature
What this PR does / why we need it:
Explicitly exclude the etcd and etcd-readiness checks (OCPBUGS-48177) and have the etcd operator take responsibility for properly reporting etcd readiness. Justification: kube-apiserver instances get removed from a load balancer when etcd starts to report not ready (since KA's /readyz then reports not ready as well). Client connections can tolerate etcd unreadiness for longer than the readiness timeout, so it is not necessary to drop connections when etcd becomes ready again before a client connection times out naturally. This is a downstream-only patch, as OpenShift's way of using etcd is unique.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
- The etcd and etcd-readiness checks can't be just simply commented out/removed.
- handleRootHealth and InstallPathHandlerWithHealthyFunc are extended with extra arguments to inject the list of excluded checks (a rough sketch of the idea follows below).
- The checks are still passed through AddReadyzChecks, yet excluded from the final addition since AddReadyzChecks can be invoked from multiple places.
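As a rough illustration of the "inject the list of excluded checks" idea; the names and signatures below are simplified stand-ins and do not reproduce the actual handleRootHealth / InstallPathHandlerWithHealthyFunc signatures:

package main

import (
	"fmt"
	"net/http"
)

// checkFn is a simplified stand-in for healthz.HealthChecker.
type checkFn struct {
	name string
	run  func(*http.Request) error
}

// installReadyz wires /readyz with a baked-in exclusion list, so the excluded
// checks stay registered but never influence the reported readiness. The idea
// is that exclusion happens once in the server code, instead of relying on
// every caller to append ?exclude=... to the probe path.
func installReadyz(mux *http.ServeMux, excluded map[string]bool, checks ...checkFn) {
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		for _, c := range checks {
			if excluded[c.name] {
				continue // e.g. "etcd", "etcd-readiness"
			}
			if err := c.run(r); err != nil {
				http.Error(w, fmt.Sprintf("[-]%s failed", c.name), http.StatusInternalServerError)
				return
			}
		}
		fmt.Fprintln(w, "ok")
	})
}

func main() {
	mux := http.NewServeMux()
	installReadyz(mux,
		map[string]bool{"etcd": true, "etcd-readiness": true},
		checkFn{name: "ping", run: func(*http.Request) error { return nil }},
		checkFn{name: "etcd", run: func(*http.Request) error { return fmt.Errorf("not ready") }},
	)
	fmt.Println(http.ListenAndServe(":8081", mux))
}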
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: