close fails with irrelevant errors after perl 5.38 #22883

toddr · 2025-01-02T18:55:27Z

After upgrade from perl 5.36 to 5.40, we began encountering issues with code that used Git::Repository. This module uses System::Command to interface with the git command line binary. We implemented a modified $git->run, which does the following:

my $select = IO::Select->new;
foreach my $fh ( $stdout, $stderr ) {
    $fh->blocking(0);
    $select->add($fh);
}

while ( $select->count ) {
    my @ready = $select->can_read;
    foreach my $fh (@ready) {
        if ( eof $fh ) {
            $select->remove($fh);
        }
        else {
            ${ $data->{$fh} } .= $_ while (<$fh>);
        }
    }
}

This code is pretty much straight out of the documentation example in IO::Select.

During cleanup, System::Command does the following to its handles for $stdin/$stdout/$stderr and this is causing a lot of noise right now for us:

sub close {
    my ($self) = @_;

    # close all pipes
    my ( $in, $out, $err ) = @{$self}{qw( stdin stdout stderr )};
    $in  and $in->opened  and $in->close  || carp "error closing stdin: $!";
    $out and $out->opened and $out->close || carp "error closing stdout: $!";
    $err and $err->opened and $err->close || carp "error closing stderr: $!";

    # and wait for the child (if any)
    $self->_reap();

    return $self;
}

It's not uncommon to see this on CPAN, and it's even worse: We see close $fh or die..., which means all of this code now unexpectedly blows up.

Digging deeper into our problem, we determined that the IO::Select method can_read is causing EAGAIN (or is it EWOULDBLOCK? They're the same error) to be put as an error on the file handle, which now causes close to exit non-zero.

In the change introduced by #20103, which addressed issues raised by #20060, they were concerned with losing errors on the file handle during close. While I can see some value in EISDIR being retained, I struggle to understand why we would retain ALL of the errors, especially if there is a successful read afterwards.

I would prefer #20103 to be reverted, but I think we're past that, so I want to suggest an alternative to @tonycoz, who worked on this originally: Possibly, we could restore the PerlIO_clearerr unless the error set are specific ones we care about. In my specific case, I would argue that EAGAIN should be cleared if there is later a successful read, right?

The text was updated successfully, but these errors were encountered:

toddr · 2025-01-02T19:42:31Z

@jkeenan while IO::Select is dist-IO, the underlying issues are with select and close which are core functions. Not sure how that'd be classified?

tonycoz · 2025-01-08T04:08:49Z

After upgrade from perl 5.36 to 5.40, we began encountering issues with code that used Git::Repository. This module uses System::Command to interface with the git command line binary. We implemented a modified $git->run, which does the following:
my $select = IO::Select->new;
foreach my $fh ( $stdout, $stderr ) {
    $fh->blocking(0);
    $select->add($fh);
}

while ( $select->count ) {
    my @ready = $select->can_read;
    foreach my $fh (@ready) {
        if ( eof $fh ) {
            $select->remove($fh);
        }
        else {
            ${ $data->{$fh} } .= $_ while (<$fh>);
        }
    }
}
This code is pretty much straight out of the documentation example in IO::Select.

This code does not attempt buffered I/O on a non-blocking handle.

During cleanup, System::Command does the following to its handles for $stdin/$stdout/$stderr and this is causing a lot of noise right now for us:
sub close {
    my ($self) = @_;

    # close all pipes
    my ( $in, $out, $err ) = @{$self}{qw( stdin stdout stderr )};
    $in  and $in->opened  and $in->close  || carp "error closing stdin: $!";
    $out and $out->opened and $out->close || carp "error closing stdout: $!";
    $err and $err->opened and $err->close || carp "error closing stderr: $!";

    # and wait for the child (if any)
    $self->_reap();

    return $self;
}
It's not uncommon to see this on CPAN, and it's even worse: We see close $fh or die..., which means all of this code now unexpectedly blows up.

Using readline in combination with select (aka IO::Select) is dangerous, if the remote process sends two lines that arrive as a single packet, readline will read the first line and leave the second line in the buffer but can_read() will not consider the handle readable until the next packet arrives, possibly deadlocking your process.

Digging deeper into our problem, we determined that the IO::Select method can_read is causing EAGAIN (or is it EWOULDBLOCK? They're the same error) to be put as an error on the file handle, which now causes close to exit non-zero.

can_read() doesn't do anything that can produce EAGAIN or EWOULDBLOCK and select() doesn't set the error flag on file handles.

(or is it EWOULDBLOCK? They're the same error)

Unfortunately, not quite, they can have different numbers, and a read(2) that would block on a non-blocking socket is permitted to return either.

In the change introduced by #20103, which addressed issues raised by #20060, they were concerned with losing errors on the file handle during close. While I can see some value in EISDIR being retained, I struggle to understand why we would retain ALL of the errors, especially if there is a successful read afterwards.

I would prefer #20103 to be reverted, but I think we're past that, so I want to suggest an alternative to @tonycoz, who worked on this originally: Possibly, we could restore the PerlIO_clearerr unless the error set are specific ones we care about. In my specific case, I would argue that EAGAIN should be cleared if there is later a successful read, right?

I do think clearing the error on an EAGAIN or EWOULDBLOCK is a reasonable workaround for historically questionable code.

I think only those two errors should be suppressed, other errors should already be handled at a lower level (EINTR), are real issues (EISDIR, EIO, EINVAL) or indicate a serious bug (EFAULT, EBADF). It's possible non-POSIX platforms return other errors reasonable to preserve.

Other functions like read() (perl's read not the system call), getc() do not do this error clearing that readline() historically did.

(or the equivalent EWOULDBLOCK) This allows questionable code that tries to combine select and the readline flavour of buffered I/O to limp along. Such code is still risky due to select() checking the underlying OS handle and not the perl handle. Fixes Perl#22883

toddr added the Needs Triage label Jan 2, 2025

jkeenan added the dist-IO issues in the dual-life blead-first IO distribution label Jan 2, 2025

jkeenan added type-core and removed Needs Triage labels Jan 2, 2025

tonycoz linked a pull request Jan 12, 2025 that will close this issue

readline: clear the error flag if the error happens to be EAGAIN #22907

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

close fails with irrelevant errors after perl 5.38 #22883

close fails with irrelevant errors after perl 5.38 #22883

toddr commented Jan 2, 2025 •

edited

Loading

toddr commented Jan 2, 2025

tonycoz commented Jan 8, 2025

close fails with irrelevant errors after perl 5.38 #22883

close fails with irrelevant errors after perl 5.38 #22883

Comments

toddr commented Jan 2, 2025 • edited Loading

toddr commented Jan 2, 2025

tonycoz commented Jan 8, 2025

toddr commented Jan 2, 2025 •

edited

Loading