Replace covert pipe with SIGCHLD handler #2550

ntrrgc · 2025-01-10T03:18:49Z

In short, when running subprocesses that share the terminal, ninja intentionally leaves a pipe open before exec() so that it can use EOF from that pipe to detect when the subprocess has exited.

That mechanism is problematic: If the subprocess ends up spawning background processes (e.g. sccache), those would also inherit the pipe by default. In that case, ninja may not detect process termination until all background processes have quitted.

This patch makes it so that, instead of propagating the pipe file descriptor to the subprocess without its knowledge, ninja keeps both ends of the pipe to itself, and uses a SIGCHLD handler to close the write end of the pipe when the subprocess has truly exited.

During testing I found Subprocess::Finish() lacked EINTR retrying, which made ninja crash prematurely. This patch also fixes that.

Fixes #2444

src/subprocess-posix.cc

digit-google · 2025-01-10T12:39:08Z

src/subprocess-posix.cc

+
+struct PidFdEntry;
+typedef unsigned int IxEntry;
+// PidFdList is used to store the PID and file descriptor pairs of the


Avoid complicated logic in the signal handler. Instead write the child pid to a self-pipe that can be waited on in DoWork() (ensure the pipe descriptors are not leaked to spawned commands too). This avoids leaking the file descriptor to a global table, which makes reasoning about lifecycle difficult (e.g. there are code paths where this descriptor will never be closed properly in your code).

Using a linked list or any kind of map to find the fd from the pid is probably not needed. Just scan the array of Subprocess instances linearly, since command termination is not in the critical performance path (even when 1000+ of commands are launched in parallel).

Just scan the array of Subprocess instances linearly

I considered this originally, but I don't know how many processes could end up in that table if someone really runs a lot of jobs share terminal, or if that is ever an use case.

Instead write the child pid to a self-pipe that can be waited on in DoWork()

I was trying to avoid modifications on DoWork(). I considered writing to pipes, but that has the additional risk that a write can potentially deadlock if the pipe buffer is full.
However, if we can rely on ppoll/pselect() returning EINTR after the first SIGCHLD signal handler execution, we could instead use a simple int field to communicate between the two, similar to how it's done for SIGINT:

Before ppoll(), set the "terminated PID" field to -1. Call ppoll(), if you get EINTR, check whether that field got a value other than -1. If it did, that's a process that is done. At that point we wouldn't even need the pipes, although they may still be useful to keep things orthogonal between the console and non-console use cases.

I considered this originally, but I don't know how many processes could end up in that table if someone really runs a lot of jobs share terminal, or if that is ever an use case.

Ah, it turns out that Ninja, in its current design, only allows to run a single "console" sub-process at a time (this is implemented elsewhere and is not visible in this file). I think this can be leveraged to avoid using a self-pipe entirely.

I was trying to avoid modifications on DoWork(). I considered writing to pipes, but that has the additional risk that a write can potentially deadlock if the pipe buffer is full.

Technically, this is extremely unlikely. In this code, the signal handler can only run during the pselect() / ppoll() call. It would require thousands of processes to all terminate during that exact syscall to block the pipe buffer (which are very large these days, see https://github.com/afborchert/pipebuf?tab=readme-ov-file#some-results for some not-so-recent practical results).

But we can avoid pipes nonetheless.

However, if we can rely on ppoll/pselect() returning EINTR after the first SIGCHLD signal handler execution, we could instead use a simple int field to communicate between the two, similar to how it's done for SIGINT:

There is no actual guarantee that the system call would return after only a single SIGCHLD signal was handled.

On the other hand, because there is only one console subprocess, it should be possible to write its pid value to a global variable that the signal handler compares to. In case of success, it would set an atomic flag to true that can be trivially checked in DoWork(). More specifically:

Add SubprocessSet::console_subproc_ as a Subprocess* pointer to the current console process if any.
Ensure that starting a new subprocess updates the pointer if needed (and assert that only one can exist).

Add two global sig_atomic_t values. One s_console_pid, will contain the pid of the console subprocess after it is started, or a special value (e.g. -1) to indicate there is no console process currently (which would be written in Subprocess::Finish). The second s_console_exited will be used as a boolean flag.

The SIGCHLD handler simply compares the signal's pid to the value of s_console_pid. If they match, it sets s_console_exited to 1.

In DoWork(), set s_console_exited to 0 before calling pselect() or ppoll(), and look at its value after the call.

Wdyt?

There is no actual guarantee that the system call would return after only a single SIGCHLD signal was handled.

Sad. Do you have a source on this by any chance?

Ah, I am trying to wrap my head around signals again, this is all so subtle, some details coming back:

SIGCHLD is a standard signals, not a real one, which means that it is not queued. When several processes terminate outside of the pselect() call, they are collapsed into a single signal handler call during the syscall (probably passing the pid of the last process). See
https://stackoverflow.com/questions/48750382/can-not-receive-all-the-sigchld

In other words, you we can only treat SIGCHLD as a boolean flag that says "some child has stopped", then have to use waitpid(..., WNOHANG) to scan the state of all processes of interest. Luckily for Ninja, that would be looking at the state of the single console process.

I checked POSIX the other day and it makes it an implementation detail whether operating systems queue signals. Indeed most don't, including Linux. QNX does!

Add two global sig_atomic_t values. One s_console_pid, will contain the pid of the console subprocess

sig_atomic_t is not guaranteed to be large enough to hold a PID, it's only guaranteed to hold 8 bits.

The SIGCHLD handler simply compares the signal's pid to the value of s_console_pid. If they match, it sets s_console_exited to 1.

Something irks me about this: Using si_pid in SIGCHLD is only safe if we only have one children and therefore one potential sender of SIGCHLD (assuming none is trying to send SIGCHLD maliciously to ninja too). But if we only have one children, we already know its PID.

And yet, it feels so fragile to assume this! An alternative would be to use waitpid with WNOHANG but that also requires tweaking the Finish() code.

I agree, it looks like the only thing that a SIGCHLD reliably indicates is that "at least one child process terminated", hence not much can be done in the signal handler, and instead the main loop should, when this even happens, call waitpid() with WNOHANG on all known child processes to verify their status.

Yesterday I retook this a little bit: The main disadvantage of waitpid(WNOHANG) for figuring out the terminated children is that it's a mutable operation, so it requires refactoring code so that the waitpid() is not done accidentally twice on the same PID (also, being very strict, the PID stops existing exactly from the time the waitpid() is called and can be reused).

Fortunately, I think this is perfectly doable within ninja's Subprocess and SubprocessSet design.

I got to implement it, but I must have gotten something wrong as I get ninja: fatal: waitpid(930444): No child processes on the first call to waitpid when running under gdb. I'll keep investigating.

Oh, regarding removing the self-pipe: I'm also doing that; however, be aware this also adds some complexity.

One of the parts where it adds complexity is that you have to support doing ppoll() and pselect() with zero file descriptors, and do it in a portable way. The man pages are a bit less helpful than I'd like:

The Linux man page for pselect() and select() talks about how in the olden days people used select() with nfds=0 and the three sets, and a timeout, as «a fairly portable way to sleep with subsecond precision».

I do hope that is indeed portable enough and it extends to pselect(), as nfds == 0 is not mentioned elsewhere in that page.

The Linux man page for ppoll() doesn't specify _Nullable for fds. It also makes no mention of whether nfds == 0 is valid or not.

However, you're allowed to use a negative fd inside a pollfd entry to have "unused" entries. I hope nfds == 1 with a single entry with fd < 0 is supported across operating systems; but again, the man pages don't mention this explicitly.

I must have gotten something wrong as I get ninja: fatal: waitpid(930444): No child processes on the first call to waitpid when running under gdb.

What I had gotten wrong was that I had written this:

while ((ret = waitpid(pid_, &status, options) < 0)) {

instead of this:

while ((ret = waitpid(pid_, &status, options)) < 0) {

After that I did some more debugging and cleaning and I'm now sending the updated patch for review.

ntrrgc · 2025-01-10T15:44:03Z

Just realized I also forgot to remove this comment:

    // In the console case, output_pipe is still inherited by the child and
    // closed when the subprocess finishes, which then notifies ninja.

ntrrgc · 2025-01-10T16:13:28Z

What is the point of HandlePendingInterruption()?

interrupted_ = 0;
int ret = ppoll(&fds.front(), nfds, NULL, &old_mask_);
if (ret == -1) {
  if (errno != EINTR) {
    perror("ninja: ppoll");
    return false;
  }
  return IsInterrupted();
}

HandlePendingInterruption();
if (IsInterrupted())
  return true;

If there is a SIGINT pending by the time HandlePendingInterruption() is called, wouldn't ppoll() handle it and return EINTR on the next DoWork() cycle?

digit-google · 2025-01-31T13:06:32Z

Regarding HandlePendingInterruption(), from looking at the git log for src/subprocess-posix.cc, it looks like this call was introduced by f2bfbda , which commit message states that:

ppoll/pselect prioritizes file descriptor events over
a signal delivery. So a flood of events prevents ninja
from reacting keyboard interruption by the user.

This CL adds a check for pending keyboard interruptions after
file descriptor events.

That would explain it.

ntrrgc · 2025-01-31T13:23:58Z

ppoll/pselect prioritizes file descriptor events over
a signal delivery. So a flood of events prevents ninja
from reacting keyboard interruption by the user.

I see, that explains the mystery, thanks for looking it up. I will try to remember to add a comment so that it's not a mystery for the next person as well.

ntrrgc · 2025-01-31T17:07:32Z

I should also update the commit and PR title now that this version doesn't use a self-pipe trick at all.

For background, see ninja-build#2444 (comment). In short, when running subprocesses that share the terminal, ninja intentionally leaves a pipe open before exec() so that it can use EOF from that pipe to detect when the subprocess has exited. That mechanism is problematic: If the subprocess ends up spawning background processes (e.g. sccache), those would also inherit the pipe by default. In that case, ninja may not detect process termination until all background processes have quitted. This patch makes it so that, instead of propagating a pipe file descriptor to the subprocess without its knowledge, ninja relies on the SIGCHLD signal together with waitpid(WNOHANG) to detect termination of console subprocesses. After this patch, console-sharing subprocesses do no longer have an associated pipe. Fixes ninja-build#2444

digit-google reviewed Jan 10, 2025

View reviewed changes

src/subprocess-posix.cc Outdated Show resolved Hide resolved

digit-google reviewed Jan 10, 2025

View reviewed changes

src/subprocess-posix.cc Show resolved Hide resolved

digit-google reviewed Jan 10, 2025

View reviewed changes

ntrrgc force-pushed the 2025-01-10-sigchld branch from d7ba31b to c7be3b8 Compare January 10, 2025 14:58

ntrrgc force-pushed the 2025-01-10-sigchld branch from c7be3b8 to fa92678 Compare January 31, 2025 16:30

ntrrgc force-pushed the 2025-01-10-sigchld branch from fa92678 to ccd3090 Compare January 31, 2025 17:07

ntrrgc changed the title ~~Replace covert pipe with self-pipe SIGCHLD handler~~ Replace covert pipe with SIGCHLD handler Jan 31, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace covert pipe with SIGCHLD handler #2550

Replace covert pipe with SIGCHLD handler #2550

ntrrgc commented Jan 10, 2025

digit-google Jan 10, 2025

ntrrgc Jan 10, 2025 •

edited

Loading

digit-google Jan 10, 2025

ntrrgc Jan 10, 2025

digit-google Jan 10, 2025

ntrrgc Jan 30, 2025

ntrrgc Jan 30, 2025

digit-google Jan 31, 2025

ntrrgc Jan 31, 2025

ntrrgc Jan 31, 2025

ntrrgc commented Jan 10, 2025

ntrrgc commented Jan 10, 2025 •

edited

Loading

digit-google commented Jan 31, 2025

ntrrgc commented Jan 31, 2025

ntrrgc commented Jan 31, 2025

Replace covert pipe with SIGCHLD handler #2550

Are you sure you want to change the base?

Replace covert pipe with SIGCHLD handler #2550

Conversation

ntrrgc commented Jan 10, 2025

Choose a reason for hiding this comment

ntrrgc Jan 10, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ntrrgc commented Jan 10, 2025

ntrrgc commented Jan 10, 2025 • edited Loading

digit-google commented Jan 31, 2025

ntrrgc commented Jan 31, 2025

ntrrgc commented Jan 31, 2025

ntrrgc Jan 10, 2025 •

edited

Loading

ntrrgc commented Jan 10, 2025 •

edited

Loading