Configuring reboot window using flags as documented is not working #193

steled · 2023-03-28T07:23:21Z

Description

Hi, I set up a reboot window of 2 hours via environment variables, but the reboot does not happen during this window.

Impact

The node is not rebooted during the reboot window and does not get the new update.

Environment and steps to reproduce

Set-up: I set up FLUO as described in the Usage section
Task: for the reboot window I added the following lines to the update-operator.yaml file:

        env:
        ...
        - name: UPDATE_OPERATOR_REBOOT_WINDOW_START
          value: "09:00"
        - name: UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH
          value: "2h"

Action(s): to trigger an update, I switch to another channel
Error: the node isn't rebooted during the reboot window

Expected behavior

The node with the update should be rebooted during the reboot window

Additional information

I can see that the labels and annotations change:

Labels:             flatcar-linux-update.v1.flatcar-linux.net/group=stable
                    flatcar-linux-update.v1.flatcar-linux.net/id=flatcar
                    flatcar-linux-update.v1.flatcar-linux.net/reboot-needed=true
                    flatcar-linux-update.v1.flatcar-linux.net/version=3374.2.5
                    v1.kubeone.io/operating-system=flatcar
Annotations:        flatcar-linux-update.v1.flatcar-linux.net/last-checked-time: 1679986757
                    flatcar-linux-update.v1.flatcar-linux.net/new-version: 3510.1.0
                    flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress: false
                    flatcar-linux-update.v1.flatcar-linux.net/reboot-needed: true
                    flatcar-linux-update.v1.flatcar-linux.net/status: UPDATE_STATUS_UPDATED_NEED_REBOOT

I can't see any problems from the logs of the pod:

$ kubectl logs -n reboot-coordinator flatcar-linux-update-operator-85b99fd865-swqgc
I0328 06:58:07.376415       1 main.go:108] /bin/update-operator running
I0328 06:58:07.376546       1 leaderelection.go:248] attempting to acquire leader lease reboot-coordinator/flatcar-linux-update-operator-lock...
I0328 06:58:07.401639       1 leaderelection.go:258] successfully acquired lease reboot-coordinator/flatcar-linux-update-operator-lock
I0328 06:58:08.382212       1 operator.go:593] Found 0 rebooted nodes
<for the sake of brevity>
I0328 07:18:47.188086       1 operator.go:593] Found 0 rebooted nodes

And I can see that the environment variables are correctly set in the pod:

kubectl exec -it -n reboot-coordinator flatcar-linux-update-operator-85b99fd865-swqgc -- sh
/bin $ env
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.96.0.1:443
HOSTNAME=flatcar-linux-update-operator-85b99fd865-swqgc
SHLVL=1
UPDATE_OPERATOR_REBOOT_WINDOW_START=09:00
HOME=/
TERM=xterm
UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH=2h
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
POD_NAMESPACE=reboot-coordinator
KUBERNETES_SERVICE_HOST=10.96.0.1
PWD=/bin

Additional question

Is it possible to set the reboot window only for office working hours, for example:

UPDATE_OPERATOR_REBOOT_WINDOW_START: Mon, Tue, Wed, Thu, Fri 09:00
UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH: 8h

invidian · 2023-03-28T07:35:55Z

Good finding. It seems those environment variables were documented in cda5e86, but never implemented. They should be removed from the documentation.

Separately, we can discuss whether it makes sense to actually implement this, as right now I don't see an obvious benefit in doing so. Is there a specific reason you would prefer environment variables over CLI flags? As far as I know, it should be possible to refer to environment variables in CLI args in the pod spec; perhaps you can use that instead?

steled · 2023-03-28T09:00:54Z

Ok, so doing it via env variables directly is not important.

I tried both variants, first with the flags set directly:

        command:
        - "/bin/update-operator"
        args:
          - "--reboot-window-start=10:45"
          - "--reboot-window-length=1h"

and as described here:

        command:
        - "/bin/update-operator"
        args:
          - "--reboot-window-start=$(UPDATE_OPERATOR_REBOOT_WINDOW_START)"
          - "--reboot-window-length=$(UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH)"
        env:
        ...
        - name: UPDATE_OPERATOR_REBOOT_WINDOW_START
          value: "10:30"
        - name: UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH
          value: "1h"

But neither of them works.

From inside the pod I can see:

k exec -it -n reboot-coordinator flatcar-linux-update-operator-7f864bbd94-q4cm9 -- sh
/bin $ ps
PID   USER     TIME  COMMAND
    1 nobody    0:00 /bin/update-operator --reboot-window-start=10:45 --reboot-window-length=1h
   20 nobody    0:00 sh
   27 nobody    0:00 ps

But the node is still not rebooting.

The logs don't show anything either:

$ k logs -n reboot-coordinator flatcar-linux-update-operator-7f864bbd94-q4cm9 -f
I0328 08:40:48.396657       1 main.go:108] /bin/update-operator running
I0328 08:40:48.398287       1 leaderelection.go:248] attempting to acquire leader lease reboot-coordinator/flatcar-linux-update-operator-lock...
I0328 08:40:48.430159       1 leaderelection.go:258] successfully acquired lease reboot-coordinator/flatcar-linux-update-operator-lock
I0328 08:40:49.404412       1 operator.go:593] Found 0 rebooted nodes
<for the sake of brevity>
I0328 08:58:26.923732       1 operator.go:593] Found 0 rebooted nodes

And from the other pods I see the following:

$ k logs -n reboot-coordinator flatcar-linux-update-agent-5swm4
I0328 08:40:48.605642       1 main.go:84] /bin/update-agent running
I0328 08:40:48.605698       1 agent.go:145] Setting info labels
I0328 08:40:48.631998       1 agent.go:151] Checking annotations
I0328 08:40:48.634019       1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.683963       1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.684309       1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.685236       1 agent.go:306] Updating status
I0328 08:40:48.685250       1 agent.go:319] Indicating a reboot is needed

k logs -n reboot-coordinator flatcar-linux-update-agent-qgcrk
I0328 08:40:48.484998       1 main.go:84] /bin/update-agent running
I0328 08:40:48.486076       1 agent.go:145] Setting info labels
I0328 08:40:48.519265       1 agent.go:151] Checking annotations
I0328 08:40:48.525350       1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.552000       1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.552279       1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.562810       1 agent.go:306] Updating status

k logs -n reboot-coordinator flatcar-linux-update-agent-v277b
I0328 08:40:48.488548       1 main.go:84] /bin/update-agent running
I0328 08:40:48.492439       1 agent.go:145] Setting info labels
I0328 08:40:48.517376       1 agent.go:151] Checking annotations
I0328 08:40:48.520167       1 agent.go:174] Setting annotations map[string]string{"flatcar-linux-update.v1.flatcar-linux.net/reboot-in-progress":"false", "flatcar-linux-update.v1.flatcar-linux.net/reboot-needed":"false"}
I0328 08:40:48.536459       1 agent.go:212] Waiting for ok-to-reboot from controller...
I0328 08:40:48.536617       1 agent.go:362] Beginning to watch update_engine status
I0328 08:40:48.538043       1 agent.go:306] Updating status
I0328 08:41:28.946436       1 agent.go:306] Updating status
I0328 08:41:29.075044       1 agent.go:306] Updating status
I0328 08:41:36.012535       1 agent.go:306] Updating status
I0328 08:42:06.428615       1 agent.go:306] Updating status
I0328 08:42:11.368669       1 agent.go:306] Updating status
I0328 08:42:11.369343       1 agent.go:319] Indicating a reboot is needed

EDIT:

But if I remove the reboot window, the logs change instantly:

k logs -n reboot-coordinator flatcar-linux-update-operator-c4f798f44-t2v8c -f
I0328 09:17:16.639694       1 main.go:108] /bin/update-operator running
I0328 09:17:16.643182       1 leaderelection.go:248] attempting to acquire leader lease reboot-coordinator/flatcar-linux-update-operator-lock...
I0328 09:17:16.689928       1 leaderelection.go:258] successfully acquired lease reboot-coordinator/flatcar-linux-update-operator-lock
I0328 09:17:17.651123       1 operator.go:593] Found 0 rebooted nodes
I0328 09:17:18.051270       1 operator.go:535] Found 1 nodes that need a reboot
I0328 09:17:48.527126       1 operator.go:593] Found 0 rebooted nodes
I0328 09:17:49.110917       1 operator.go:508] Found node "kkp-test-core-cp-2" still rebooting, waiting
I0328 09:17:49.111901       1 operator.go:511] Found 1 (of max 1) rebooting nodes; waiting for completion
I0328 09:17:49.112073       1 operator.go:535] Found 0 nodes that need a reboot
I0328 09:18:19.166212       1 operator.go:593] Found 0 rebooted nodes
I0328 09:18:19.320955       1 operator.go:508] Found node "kkp-test-core-cp-2" still rebooting, waiting
I0328 09:18:19.320970       1 operator.go:511] Found 1 (of max 1) rebooting nodes; waiting for completion
I0328 09:18:19.321507       1 operator.go:535] Found 0 nodes that need a reboot

steled · 2023-03-28T11:32:03Z

Ok, I think I found the problem...
The time in the pod is GMT, but we are at GMT+2, so the window I configured starts two hours later than I expected.

So 2 questions:

Is it possible to update the timezone for the pod?
Is it possible to set the reboot window only for office working hours, for example:

UPDATE_OPERATOR_REBOOT_WINDOW_START: Mon, Tue, Wed, Thu, Fri 09:00
UPDATE_OPERATOR_REBOOT_WINDOW_LENGTH: 8h

invidian · 2023-03-28T11:41:58Z

> Is it possible to update the timezone for the pod?

I don't know how it works in Kubernetes, but normally servers use UTC time for uniformity. I guess it's up to the host OS configuration.
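If the pod clock stays on UTC, one workaround is to express the window start in UTC instead of local time. The sketch below is only an illustration of that arithmetic: the helper name `local_to_utc_hhmm` is hypothetical and not part of FLUO, and the fixed offset of +2 hours is taken from the report above. A fixed offset does not follow daylight saving changes, so the value would need adjusting when the local offset changes.

```python
def local_to_utc_hhmm(local_hhmm: str, utc_offset_hours: float) -> str:
    """Convert a local wall-clock "HH:MM" to the equivalent UTC "HH:MM".

    utc_offset_hours is the local offset from UTC (for example 2 for GMT+2).
    This is a hypothetical helper for illustration, not a FLUO function.
    """
    hours, minutes = map(int, local_hhmm.split(":"))
    local_minutes = hours * 60 + minutes
    # Subtract the offset and wrap around midnight.
    utc_minutes = (local_minutes - round(utc_offset_hours * 60)) % (24 * 60)
    return f"{utc_minutes // 60:02d}:{utc_minutes % 60:02d}"


if __name__ == "__main__":
    # A window meant to start at 09:00 local time at GMT+2.
    print(local_to_utc_hhmm("09:00", 2))
```

The resulting value is what would be passed to `--reboot-window-start` so that the window opens at the intended local time.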

> Is it possible to set the reboot window only for office working hours, for example:

Not at the moment, as this would imply multiple windows and right now only one window is supported.

Probably this whole feature could be given some thought and improved, as the existing implementation is already overly complex.

> The time in the pod is GMT but we are at GMT +2

👍 We already have timestamps in the logs; I think that should help with spotting this kind of error.
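As a rough way to use those timestamps, the sketch below reads the clock time from a klog-style log line and checks whether it falls inside a daily window. It is a simplified illustration under stated assumptions, not the operator's actual window logic: the function names are hypothetical, the window is assumed to repeat every day, and the log clock is assumed to be the pod's clock (UTC in this report).

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def klog_seconds_since_midnight(line: str) -> float:
    """Extract the time of day from a klog line.

    klog headers look like: Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg
    """
    clock = line.split()[1]
    hours, minutes, seconds = clock.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def in_daily_window(seconds_since_midnight: float, start_hhmm: str,
                    length: timedelta) -> bool:
    """Return True if the time of day lies within [start, start + length).

    Hypothetical helper: assumes one window per day, wrapping past midnight.
    """
    start_h, start_m = map(int, start_hhmm.split(":"))
    start = start_h * 3600 + start_m * 60
    elapsed = (seconds_since_midnight - start) % SECONDS_PER_DAY
    return elapsed < length.total_seconds()


if __name__ == "__main__":
    line = "I0328 07:18:47.188086       1 operator.go:593] Found 0 rebooted nodes"
    now = klog_seconds_since_midnight(line)
    # Window from the original report: start 09:00, length 2h.
    print(in_daily_window(now, "09:00", timedelta(hours=2)))
```

Run against the log line from the first report, the check shows that 07:18 on the pod clock is outside a 09:00 window of 2 hours, even though it was 09:18 local time at GMT+2.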

steled · 2023-03-28T11:55:46Z

Ok, I opened [RFE] Add support for multiple reboot windows

This ticket can then be closed...

Thanks for your support 👍

invidian changed the title ~~Reboot window is not working~~ Configuring reboot window using environment variables as documented is not working Mar 28, 2023

invidian added the documentation Improvements or additions to documentation label Mar 28, 2023

invidian added a commit that referenced this issue Mar 28, 2023

doc/reboot-windows.md: remove environment variables references

13bbb74

Their support has never been implemented. Closes #193 Signed-off-by: Mateusz Gozdek <[email protected]>

invidian mentioned this issue Mar 28, 2023

doc/reboot-windows.md: remove environment variables references #194

Merged

steled changed the title ~~Configuring reboot window using environment variables as documented is not working~~ Configuring reboot window using flags as documented is not working Mar 28, 2023

invidian added a commit that referenced this issue Mar 28, 2023

doc/reboot-windows.md: remove environment variables references

64e137a

Their support has never been implemented. Refs #193 Signed-off-by: Mateusz Gozdek <[email protected]>

steled closed this as completed Mar 28, 2023

invidian mentioned this issue Mar 28, 2023

[RFE] Add support for multiple reboot windows #195

Open
