Configuring reboot window using flags as documented is not working #193
Comments
Good finding. It seems those environment variables were documented in cda5e86, but never implemented. They should be removed from the documentation. Separately, we can discuss whether it makes sense to actually implement them, as right now I don't see an obvious benefit in doing so. Is there some specific reason you would prefer to use environment variables instead of CLI flags? As far as I know, it should be possible to refer to env variables in CLI args in the pod spec. Perhaps you can use this instead?
Their support has never been implemented. Closes #193 Signed-off-by: Mateusz Gozdek <[email protected]>
Ok, so doing it via env variables directly is not important. I did it with both variants:
command:
- "/bin/update-operator"
args:
- "--reboot-window-start=10:45"
- "--reboot-window-length=1h"
and as described here:
command:
- "/bin/update-operator"
args:
- "--reboot-window-start=$(UPDATE_OPERATOR_REBOOT_WINDOW_START)"
- "--reboot-window-length=$(UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH)"
env:
...
- name: UPDATE_OPERATOR_REBOOT_WINDOW_START
value: "10:30"
- name: UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH
value: "1h"
But neither of them works. From inside the pod I can see:
k exec -it -n reboot-coordinator flatcar-linux-update-operator-7f864bbd94-q4cm9 -- sh
/bin $ ps
PID USER TIME COMMAND
1 nobody 0:00 /bin/update-operator --reboot-window-start=10:45 --reboot-window-length=1h
20 nobody 0:00 sh
27 nobody 0:00 ps
But the node is still not rebooting. The logs also don't show anything:
$ k logs -n reboot-coordinator flatcar-linux-update-operator-7f864bbd94-q4cm9 -f
I0328 08:40:48.396657 1 main.go:108] /bin/update-operator running
I0328 08:40:48.398287 1 leaderelection.go:248] attempting to acquire leader lease reboot-coordinator/flatcar-linux-update-operator-lock...
I0328 08:40:48.430159 1 leaderelection.go:258] successfully acquired lease reboot-coordinator/flatcar-linux-update-operator-lock
I0328 08:40:49.404412 1 operator.go:593] Found 0 rebooted nodes
<for the sake of brevity>
I0328 08:58:26.923732 1 operator.go:593] Found 0 rebooted nodes
And from the other pods I see the following:
$ k logs -n reboot-coordinator flatcar-linux-update-agent-5swm4
I0328 08:40:48.605642 1 main.go:84] /bin/update-agent running
I0328 08:40:48.605698 1 agent.go:145] Setting info labels
I0328 08:40:48.631998 1 agent.go:151] Checking annotations
I0328 08:40:48.634019 1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.683963 1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.684309 1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.685236 1 agent.go:306] Updating status
I0328 08:40:48.685250 1 agent.go:319] Indicating a reboot is needed
k logs -n reboot-coordinator flatcar-linux-update-agent-qgcrk
I0328 08:40:48.484998 1 main.go:84] /bin/update-agent running
I0328 08:40:48.486076 1 agent.go:145] Setting info labels
I0328 08:40:48.519265 1 agent.go:151] Checking annotations
I0328 08:40:48.525350 1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.552000 1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.552279 1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.562810 1 agent.go:306] Updating status
k logs -n reboot-coordinator flatcar-linux-update-agent-v277b
I0328 08:40:48.488548 1 main.go:84] /bin/update-agent running
I0328 08:40:48.492439 1 agent.go:145] Setting info labels
I0328 08:40:48.517376 1 agent.go:151] Checking annotations
I0328 08:40:48.520167 1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.536459 1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.536617 1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.538043 1 agent.go:306] Updating status
I0328 08:41:28.946436 1 agent.go:306] Updating status
I0328 08:41:29.075044 1 agent.go:306] Updating status
I0328 08:41:36.012535 1 agent.go:306] Updating status
I0328 08:42:06.428615 1 agent.go:306] Updating status
I0328 08:42:11.368669 1 agent.go:306] Updating status
I0328 08:42:11.369343 1 agent.go:319] Indicating a reboot is needed
EDIT: But if I remove the reboot window, the logs change instantly:
k logs -n reboot-coordinator flatcar-linux-update-operator-c4f798f44-t2v8c -f
I0328 09:17:16.639694 1 main.go:108] /bin/update-operator running
I0328 09:17:16.643182 1 leaderelection.go:248] attempting to acquire leader lease reboot-coordinator/flatcar-linux-update-operator-lock...
I0328 09:17:16.689928 1 leaderelection.go:258] successfully acquired lease reboot-coordinator/flatcar-linux-update-operator-lock
I0328 09:17:17.651123 1 operator.go:593] Found 0 rebooted nodes
I0328 09:17:18.051270 1 operator.go:535] Found 1 nodes that need a reboot
I0328 09:17:48.527126 1 operator.go:593] Found 0 rebooted nodes
I0328 09:17:49.110917 1 operator.go:508] Found node "kkp-test-core-cp-2" still rebooting, waiting
I0328 09:17:49.111901 1 operator.go:511] Found 1 (of max 1) rebooting nodes; waiting for completion
I0328 09:17:49.112073 1 operator.go:535] Found 0 nodes that need a reboot
I0328 09:18:19.166212 1 operator.go:593] Found 0 rebooted nodes
I0328 09:18:19.320955 1 operator.go:508] Found node "kkp-test-core-cp-2" still rebooting, waiting
I0328 09:18:19.320970 1 operator.go:511] Found 1 (of max 1) rebooting nodes; waiting for completion
I0328 09:18:19.321507 1 operator.go:535] Found 0 nodes that need a reboot
Ok, I think I found the problem... So 2 questions:
Is it possible to update the timezone for the pod?
UPDATE_OPERATOR_REBOOT_WINDOW_START: Mon, Tue, Wed, Thu, Fri 09:00
UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH: 8h
I don't know how it works in Kubernetes, but normally servers use UTC time for uniformity. I guess it's up to the host OS configuration.
Not at the moment, as this would imply multiple windows and right now only one window is supported. This whole feature could probably be given some thought and improved, as the existing implementation is already overly complex.
👍 We already have timestamps in the logs; I think that should help with spotting this kind of error.
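To illustrate the timezone point: if the operator pod evaluates the reboot window in UTC (which the replies above suggest but do not confirm), a window start written in local time has to be shifted by the local UTC offset before it is passed to `--reboot-window-start`. The following is only a minimal sketch of that arithmetic; the helper name `window_start_in_utc` and the UTC+2 offset are hypothetical and not part of the operator.

```python
from datetime import datetime, timedelta, timezone


def window_start_in_utc(local_start: str, utc_offset_hours: float) -> str:
    """Convert a local "HH:MM" start time to the equivalent "HH:MM" in UTC.

    utc_offset_hours is the local zone's offset from UTC (e.g. 2 for UTC+2).
    This helper is hypothetical; it is not part of update-operator.
    """
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    hour, minute = (int(part) for part in local_start.split(":"))
    # The date is arbitrary; only the time of day matters for the conversion.
    local_dt = datetime(2000, 1, 1, hour, minute, tzinfo=local_zone)
    return local_dt.astimezone(timezone.utc).strftime("%H:%M")


# Assumed example: a 10:45 local start in a UTC+2 zone.
print(window_start_in_utc("10:45", 2))
```

Under these assumptions, the printed value is what would go into `--reboot-window-start`. A fixed offset ignores daylight saving changes, and a start time close to midnight can land on the previous or next UTC day, so the result should be checked against the timestamps in the operator logs.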
Their support has never been implemented. Refs #193 Signed-off-by: Mateusz Gozdek <[email protected]>
Ok, I opened "[RFE] Add support for multiple reboot windows". This ticket can then be closed... Thanks for your support 👍
Description
Hi, I set up a reboot window of 2 hours via the environment variables, but the reboot does not happen during this window.
Impact
The node is not rebooted during the reboot window and does not get the new update.
Environment and steps to reproduce
update-operator.yaml file:
Expected behavior
The node with the update should be rebooted during the reboot window
Additional information
I can see that the labels and annotations change:
I can't see any problems in the logs of the pod:
And I see the environment variables correctly set on the pod:
Additional question
Is it possible to set the reboot window only for office working hours, for example: