[Bug] Docker signal handling not working #13498

Open
1 of 2 tasks
fschulze-dtm opened this issue Sep 12, 2024 · 4 comments

Comments

@fschulze-dtm

fschulze-dtm commented Sep 12, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Version

iotdb 1.3.1-standalone

Describe the bug and provide the minimal reproduce step

When stopping a Docker container running the apache/iotdb:1.3.1-standalone image, the SIGTERM trap is never executed, so the shutdown is not graceful. This is because entrypoint.sh uses exec, which replaces the shell process and with it the signal handlers installed via trap.
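The effect can be reproduced outside Docker. The following is a minimal sketch, not the actual entrypoint.sh: `sleep 30` stands in for the IoTDB process, and the function names `with_exec`, `without_exec` and the inner `on_stop` body are assumptions made for illustration. The first wrapper installs a trap and then calls exec; the second keeps the shell alive and waits on a background child.

```shell
#!/usr/bin/env bash
# Sketch only: "sleep 30" is a stand-in for the real server process.

# Wrapper A: trap, then exec. exec replaces the shell with sleep, so the
# handler no longer exists when SIGTERM arrives and nothing is printed.
with_exec() {
    bash -c '
        trap "echo on_stop ran" TERM
        exec sleep 30 >/dev/null
    ' &
    local pid=$!
    sleep 1                      # give the wrapper time to install its trap
    kill -TERM "$pid"
    wait "$pid" 2>/dev/null || true
}

# Wrapper B: the shell stays alive, runs the child in the background and
# waits for it, so the TERM handler can still run.
without_exec() {
    bash -c '
        on_stop() { echo "on_stop ran"; kill "$child" 2>/dev/null; }
        trap on_stop TERM
        sleep 30 >/dev/null &
        child=$!
        wait "$child"
    ' &
    local pid=$!
    sleep 1
    kill -TERM "$pid"
    wait "$pid" 2>/dev/null || true
}

echo "with exec:    '$(with_exec)'"
echo "without exec: '$(without_exec)'"
```

Under these assumptions, only the second wrapper should print the handler's message. Whether a background-and-wait layout suits the real entrypoint.sh is a separate design question for the maintainers.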

Furthermore, the on_stop function defined in entrypoint.sh, which should be executed on SIGTERM, contains the condition "$start_what" != "all". Therefore, in standalone mode the corresponding graceful shutdown is not executed.

To reproduce, run the Docker container and then stop it.

What did you expect to see?

The on_stop function defined in entrypoint.sh is executed when the Docker container is stopped, providing a graceful shutdown with FLUSH.

What did you see instead?

Rapid shutdown without proper signal handling and without execution of the on_stop function.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Hi, this is your first issue in IoTDB project. Thanks for your report. Welcome to join the community!

@CritasWang
Collaborator

Is there no problem with the logic here?

if [[ "$start_what" != "confignode" ]]; then
    echo "###### manually flush ######";
    start-cli.sh -e "flush;" || true
    stop-datanode.sh
    echo "##### done ######";
else
    stop-confignode.sh;
fi

@fschulze-dtm
Author

The snippet you posted is from the apache/iotdb:1.3.2-standalone image. In apache/iotdb:1.3.1-standalone it is:

if [[ "$start_what" == "datanode" ]]; then
    echo "###### manually flush ######";
    start-cli.sh -e "flush;" || true
    echo "stopping datanode service";
    stop-datanode.sh ;
    echo "##### done ######";
elif [[ "$start_what" != "all" ]]; then
    echo "###### manually flush ######";
    start-cli.sh -e "flush;" || true
    echo "stopping confignode and datanode service";
    stop-standalone.sh ;
    echo "##### done ######";
elif [[ "$start_what" == "confignode" ]]; then
    echo "stopping confignode service";
    stop-confignode.sh;
    echo "##### done ######";
fi
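To make the branch selection in the 1.3.1 chain explicit, here is a small sketch that copies only its three conditions and prints a label instead of calling the real scripts. The function name `branch_1_3_1` and the labels are made up for illustration, and the trailing else is added only to make the fall-through case visible.

```shell
#!/usr/bin/env bash
# Mirrors only the conditions of the 1.3.1 if/elif chain; each branch
# prints a label instead of running the flush and stop scripts.
branch_1_3_1() {
    local start_what=$1
    if [[ "$start_what" == "datanode" ]]; then
        echo "flush + stop-datanode.sh"
    elif [[ "$start_what" != "all" ]]; then
        echo "flush + stop-standalone.sh"
    elif [[ "$start_what" == "confignode" ]]; then
        # Unreachable: this point is only reached when start_what is "all".
        echo "stop-confignode.sh"
    else
        echo "nothing"
    fi
}

for mode in datanode confignode all; do
    printf '%-10s -> %s\n' "$mode" "$(branch_1_3_1 "$mode")"
done
```

With these conditions, "confignode" is caught by the second branch, "all" matches no branch, and the third branch can never run.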

Also, the main problem remains: entrypoint.sh uses exec, which discards the trap.

@CritasWang
Collaborator

Actually, a graceful shutdown only requires calling the stop script.

start-cli.sh -e "flush;"

This operation is only a safeguard. After the stop script is called, the program also performs the corresponding graceful shutdown processing internally.
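The "flush is optional, stop is required" idea can be sketched as follows. The two stub functions are hypothetical stand-ins for the flush call and the stop script, so that the sketch runs without an IoTDB installation; the flush stub fails on purpose to show that the stop step is still reached, even under `set -e`.

```shell
#!/usr/bin/env bash
# Hypothetical stubs; they only print what the real commands would do.
flush_via_cli()   { echo "flush attempted"; return 1; }   # simulate a failing CLI call
run_stop_script() { echo "stop script called"; }

# Subshell body so that set -e stays local to this function.
on_stop() (
    set -e
    # Best-effort flush: a failure here must not prevent the stop step.
    flush_via_cli || true
    run_stop_script
)

on_stop
```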
