Improve drain script support #10

jandubois · 2021-05-10T22:41:11Z

This PR tries to address the remaining issues in the Quarks operator regarding drain script support.

Many thanks to @univ0298 for testing the changes and @andrew-edgar for updating the tests!

History

The first identified issue was: POD draining does not work because individual containers terminate prematurely instead of waiting for the completion of all drain/preStops on all containers.

It was supposed to be fixed by: Wait for all drain scripts to finish

That fix was incomplete because containers whose main process exits will get killed immediately and not participate in the drain script waiting process. The first idea how to fix this did not work out: Make sure terminated containers are considered as already drained

Further problems are identified in: Many Cloud Foundry drain scripts do not work in KubeCF due to no monit and no bpm

Fixes

This pull request attempts to address the remaining issues with:

keep container-run running until it receives a SIGTERM signal. That way PreStop hooks should not get killed prematurely.
provide an emultation bpm script that implements the common bpm stop command used by many of the drain scripts in the CF releases. In addition it provides a mechanism to implement patches for those drain scripts that directly use the pid files to signal other processes.

In addition this PR fixes the way subprocesses are stopped via SIGKILL, orphaning their child processes and generally not allowing things to shut down gracefully. This PR mirrors the bpm stop functionality and sends a SIGTERM, followed 20s later with a SIGKILL if the child processes are still running at that time.

BPM commands

The bpm script implements the following commands from the bpm-release:

bpm start job [-p process]
bpm stop job [-p process ]

The bpm stop command will wait up to 30s for the process to actually stop (it normally should be killed via SIGKILL within 20s if it doesn't respong to SIGTERM).

The following extensions are also provided:

bpm term job [-p process]
bpm quit job [-p process]
bpm running job [-p process]

bpm term is the same as bpm stop, except it doesn't wait for the process to terminate.
bpm quit will send a SIGQUIT signal to the process (e.g. to tell nginx to shut down gracefully).
bpm running will return with a 0 exit code if the process is still running, or a 1 if it isn't. It will also print "yes" or "no" to stdout, if that is connected to a tty.

Manual testing

I compiled the program on Linux (using nc with UDP to Unix sockets doesn't seem to work on macOS):

go build -o container-run cmd/main.go

I use the following test program for all tests:

# cat main.sh 
#!/usr/bin/env bash

SIGQUIT=""
function quit {
    echo SIGQUIT
    SIGQUIT=1
}
trap quit SIGQUIT

function term {
    echo SIGTERM
    if [ -n "$SIGQUIT" ]; then
        echo "Exiting because already received SIGQUIT"
        exit 0
    fi
}
trap term SIGTERM

echo Start
for i in $(seq 10); do
    echo Loop $i
    sleep 3
done
echo End
exit 0

It runs for 30s and then exits on its own, printing a progress message every 3 seconds. It acknowledges SIGQUIT and SIGTERM signals, but only exits early if the SIGTERM has been preceeded by a SIGQUIT. It is always run via:

# ./container-run --job-name main --process-name main $PWD/main.sh; echo $?
Start
Loop 1
[...]
Loop 10
End

The main.sh process runs until completion, but container-run remains running. From a second shell main.sh can be restarted:

# alias bpm=/var/vcap/jobs/bpm/bin/bpm
# bpm start main

and the process runs again from start to finish. Running pkill container-run once the script has finished will terminate container-run immediately now.

Starting container-run again and running bpm stop main from the second terminal shows that the signal is received (once the current sleep returns), but the process doesn't abort right away; it runs for another 20s before being killed:

# time bpm stop main

real    0m20.040s

Output:

# ./container-run --job-name main --process-name main $PWD/main.sh; echo $?
Start
Loop 1
Loop 2
SIGTERM
Loop 3
Loop 4
Loop 5
Loop 6
Loop 7
Loop 8

Using bpm term main works the same as bpm stop main, but returns immediately and doesn't wait for the process to stop.

Sending a SIGQUIT before the stop command lets the script terminate right away, but container-run will keep running as expected:

# bpm start main; sleep 3; bpm quit main; bpm stop main

Output:

[...]
Loop 8
Start
Loop 1
Loop 2
SIGQUIT
SIGTERM
Exiting because already received SIGQUIT

Sending SIGTERM to container-run while main.sh is running will send SIGTERM to the script as well. It will continue running because it hasn't received a SIGQUIT first. After 20s the script will be killed via SIGKILL and container-run exits:

# bpm start main; sleep 3; pkill container-run

Output:

[...]
Exiting because already received SIGQUIT
Start
Loop 1
Loop 2
SIGTERM
Loop 3
Loop 4
Loop 5
Loop 6
Loop 7
Loop 8
0
#

The bpm running command can be used to check if the process is still running or not:

# bpm start main; bpm term main; while bpm running main; do sleep 5; done
yes
yes
yes
yes
no

Open issues

In addition to a new operator release using the updated container-run version we also need at least 2 patches at the kubecf level:

The cloud controller drain.sh script calls shutdown_drain.rb using cloud_controller_ng/drain.rb. It should be possible to replace the drain script with something like:
```
bpm quit nginx
for i in $(seq 30); do
    test bpm running nginx && sleep 1
done
bpm term nginx
bpm term cloud_controller_ng
```
The garden process invokes garden_start via tini. Sine we don't run tini using pid 1, we need to pass the -s option to enable the TINI_SUBREAPER.

Signed-off-by: Jan Dubois <[email protected]>

…ss fails To avoid terminating early and killing a still running drain script it is necessary to keep the container-run process running even if the main process has already exited. Once container-run receives the SIGTERM signal it will pass it on to the child processes and then wait until all direct children have exited before exiting itself. This should allow the processes to terminate cleanly as long as the grace period has not yet expired. container-run will also terminate immediately if the main process terminates with an error. This allows the container to fail and be restarted by k8s. Signed-off-by: Jan Dubois <[email protected]>

This is used to trigger graceful shutdown of nginx. Signed-off-by: Jan Dubois <[email protected]>

…ning Can be used to wait for a process to stop after sending the STOP command. Signed-off-by: Jan Dubois <[email protected]>

It only supports the "start" and "stop" commands from the real bpm, but has additional "quit" and "running" commands to help reimplementing drain scripts that make use of the pid files. Signed-off-by: Jan Dubois <[email protected]>

Signed-off-by: Jan Dubois <[email protected]>

It is important to process errors synchronously because their handling depends on the value of the 'active' flag. To avoid deadlocks any code sending errors to the channel must run in a blockable goroutine. Signed-off-by: Jan Dubois <[email protected]>

(and give them a chance to notify their child processes). If the processes haven't terminated in 20s, send SIGKILL. Signed-off-by: Jan Dubois <[email protected]>

Signed-off-by: Jan Dubois <[email protected]>

That way it also starts the timeout trigger if the process doesn't stop within 20s. Signed-off-by: Jan Dubois <[email protected]>

The rest are just confusing the debug output, and passing on e.g. SIGCHLD to the child processes is neither correct nor useful. Signed-off-by: Jan Dubois <[email protected]>

container-run should send a SIGKILL if the process is still alive after 20s, so if it isn't gone after 30, then it probably will not stop. The `bpm running` subcommand will also now print "yes" or "no" in addition to setting the exit code. Signed-off-by: Jan Dubois <[email protected]>

(but they no longer pass because they have not been updated with the new application logic). Signed-off-by: Jan Dubois <[email protected]>

Also modify `bpm running` to only print yes/no when stdout is a tty. Signed-off-by: Jan Dubois <[email protected]>

jandubois · 2021-05-11T00:15:46Z

we need to pass the -s option to enable the TINI_SUBREAPER

cloudfoundry-incubator/kubecf#1729

jandubois · 2021-05-11T00:17:52Z

The cloud controller drain.sh script calls shutdown_drain.rb using cloud_controller_ng/drain.rb.

This new nginx drain logic was added in capi-release v1.110.0 via cloudfoundry/capi-release@30690f3.

kubecf is currently using capi-release v1.107.0, so this does not yet apply.

mook-as

Not really sure my review is comprehensive, but it's something at least I guess?

pkg/containerrun/bpm.go

pkg/containerrun/containerrun.go

Signed-off-by: Jan Dubois <[email protected]>

The errors channel is unbuffered, so stopProcesses cannot deliver errors if it is running from the same goroutine as the receiver. Signed-off-by: Jan Dubois <[email protected]>

Signed-off-by: Jan Dubois <[email protected]>

manno · 2021-05-12T07:52:34Z

That looks great.

One thing I can think of, if any of the goroutines in the tests gives a problem (flakieness, etc) maybe defer GinkgoRecover() helps.

jandubois and others added 16 commits May 6, 2021 18:06

Unregister subprocesses once they are no longer running

e2d5ef6

Signed-off-by: Jan Dubois <[email protected]>

Add QUIT command to send SIGQUIT to process

f009f41

This is used to trigger graceful shutdown of nginx. Signed-off-by: Jan Dubois <[email protected]>

Create status indicator on the shared volume while the process is run…

50be76f

…ning Can be used to wait for a process to stop after sending the STOP command. Signed-off-by: Jan Dubois <[email protected]>

Add --debug option

484a7fb

Signed-off-by: Jan Dubois <[email protected]>

Use SIGTERM instead of SIGKILL to STOP processes

2b39e8a

(and give them a chance to notify their child processes). If the processes haven't terminated in 20s, send SIGKILL. Signed-off-by: Jan Dubois <[email protected]>

Add some additional debug logging

6cb69e3

Signed-off-by: Jan Dubois <[email protected]>

Unregister process as soon as it quits

d72b104

Signed-off-by: Jan Dubois <[email protected]>

Call stopProcesses on SIGTERM (and SIGINT for testing)

f4213ce

That way it also starts the timeout trigger if the process doesn't stop within 20s. Signed-off-by: Jan Dubois <[email protected]>

Only handle SIGINT, SIGQUIT, and SIGTERM signals

2f616e5

The rest are just confusing the debug output, and passing on e.g. SIGCHLD to the child processes is neither correct nor useful. Signed-off-by: Jan Dubois <[email protected]>

Make the tests compile again

8480c60

(but they no longer pass because they have not been updated with the new application logic). Signed-off-by: Jan Dubois <[email protected]>

Add bpm term command; same as stop, but don't wait

f48b916

Also modify `bpm running` to only print yes/no when stdout is a tty. Signed-off-by: Jan Dubois <[email protected]>

Refactor tests to work when Run does not exit

a3870a4

jandubois requested a review from mook-as May 10, 2021 22:41

jandubois mentioned this pull request May 11, 2021

Bump cf-deployment cloudfoundry-incubator/kubecf#1730

Open

jandubois requested a review from manno May 11, 2021 00:24

mook-as suggested changes May 11, 2021

View reviewed changes

Code review feedback

3b5b6bb

Signed-off-by: Jan Dubois <[email protected]>

jandubois requested a review from mook-as May 11, 2021 16:31

jandubois added 2 commits May 11, 2021 10:18

Run stopProcesses in a goroutine when called from the main goroutine

2776822

The errors channel is unbuffered, so stopProcesses cannot deliver errors if it is running from the same goroutine as the receiver. Signed-off-by: Jan Dubois <[email protected]>

Fix indentation

87b8fed

Signed-off-by: Jan Dubois <[email protected]>

mook-as approved these changes May 11, 2021

View reviewed changes

manno merged commit 76ca394 into master May 12, 2021

manno deleted the drain-support branch May 12, 2021 07:52

jandubois mentioned this pull request May 18, 2021

Create pidfile and "bpm pid" command #11

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve drain script support #10

Improve drain script support #10

jandubois commented May 10, 2021 •

edited

Loading

jandubois commented May 11, 2021

jandubois commented May 11, 2021

mook-as left a comment

manno commented May 12, 2021

Improve drain script support #10

Improve drain script support #10

Conversation

jandubois commented May 10, 2021 • edited Loading

History

Fixes

BPM commands

Manual testing

Open issues

jandubois commented May 11, 2021

jandubois commented May 11, 2021

mook-as left a comment

Choose a reason for hiding this comment

manno commented May 12, 2021

jandubois commented May 10, 2021 •

edited

Loading