Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The state.json should be generated prior to the creation of the cgroup. #4535

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions libcontainer/process_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -561,6 +561,13 @@ func (p *initProcess) start() (retErr error) {
}
}()

// A SIGKILL can happen at any time, and without the state.json,
// the 'runc delete --force' command won't be able to clear the cgroup.
_, err = p.container.updateState(p)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
We should know that we will write the pid of init process to state.json, so when we do delete -f, once this init process has dead, but runc stage 2 process is alive, I think maybe we still can't remove this cgroup path either because there is still one process in this cgroup.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing, I discovered that when systemd is used to handle cgroup, it gets cleaned up once runc init has exited. However, if we don't use systemd, some remnants of cgroup are left behind.

If we're not using systemd to manage, we might need to do a check in runc delete to see if runc init is still listed in cgroups.procs. This shouldn't take too much time since runc init is bound to exit due to an error. This error happens because when runc init gets to parts like procHooks that require synchronization with the parent process, it fails as the parent process has already been terminated, leading to errors when runc init tries to write or read values from the pipeline, and consequently, it exits.

{BF5FEB57-6023-4623-A04F-7C96FC189661} {F284436E-31E8-4967-AE89-18A124B1C24C}

if err != nil {
return fmt.Errorf("unable to store init state: %w", err)
}

// Do this before syncing with child so that no children can escape the
// cgroup. We don't need to worry about not doing this and not being root
// because we'd be using the rootless cgroup manager in that case.
Expand Down
Loading