Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

close fails with irrelevant errors after perl 5.38 #22883

Open
toddr opened this issue Jan 2, 2025 · 2 comments · May be fixed by #22907
Open

close fails with irrelevant errors after perl 5.38 #22883

toddr opened this issue Jan 2, 2025 · 2 comments · May be fixed by #22907
Labels
dist-IO issues in the dual-life blead-first IO distribution type-core

Comments

@toddr
Copy link
Member

toddr commented Jan 2, 2025

After upgrade from perl 5.36 to 5.40, we began encountering issues with code that used Git::Repository. This module uses System::Command to interface with the git command line binary. We implemented a modified $git->run, which does the following:

my $select = IO::Select->new;
foreach my $fh ( $stdout, $stderr ) {
    $fh->blocking(0);
    $select->add($fh);
}

while ( $select->count ) {
    my @ready = $select->can_read;
    foreach my $fh (@ready) {
        if ( eof $fh ) {
            $select->remove($fh);
        }
        else {
            ${ $data->{$fh} } .= $_ while (<$fh>);
        }
    }
}

This code is pretty much straight out of the documentation example in IO::Select.

During cleanup, System::Command does the following to its handles for $stdin/$stdout/$stderr and this is causing a lot of noise right now for us:

sub close {
    my ($self) = @_;

    # close all pipes
    my ( $in, $out, $err ) = @{$self}{qw( stdin stdout stderr )};
    $in  and $in->opened  and $in->close  || carp "error closing stdin: $!";
    $out and $out->opened and $out->close || carp "error closing stdout: $!";
    $err and $err->opened and $err->close || carp "error closing stderr: $!";

    # and wait for the child (if any)
    $self->_reap();

    return $self;
}

It's not uncommon to see this on CPAN, and it's even worse: We see close $fh or die..., which means all of this code now unexpectedly blows up.

Digging deeper into our problem, we determined that the IO::Select method can_read is causing EAGAIN (or is it EWOULDBLOCK? They're the same error) to be put as an error on the file handle, which now causes close to exit non-zero.

In the change introduced by #20103, which addressed issues raised by #20060, they were concerned with losing errors on the file handle during close. While I can see some value in EISDIR being retained, I struggle to understand why we would retain ALL of the errors, especially if there is a successful read afterwards.

I would prefer #20103 to be reverted, but I think we're past that, so I want to suggest an alternative to @tonycoz, who worked on this originally: Possibly, we could restore the PerlIO_clearerr unless the error set are specific ones we care about. In my specific case, I would argue that EAGAIN should be cleared if there is later a successful read, right?

@jkeenan jkeenan added the dist-IO issues in the dual-life blead-first IO distribution label Jan 2, 2025
@toddr
Copy link
Member Author

toddr commented Jan 2, 2025

@jkeenan while IO::Select is dist-IO, the underlying issues are with select and close which are core functions. Not sure how that'd be classified?

@tonycoz
Copy link
Contributor

tonycoz commented Jan 8, 2025

After upgrade from perl 5.36 to 5.40, we began encountering issues with code that used Git::Repository. This module uses System::Command to interface with the git command line binary. We implemented a modified $git->run, which does the following:

my $select = IO::Select->new;
foreach my $fh ( $stdout, $stderr ) {
    $fh->blocking(0);
    $select->add($fh);
}

while ( $select->count ) {
    my @ready = $select->can_read;
    foreach my $fh (@ready) {
        if ( eof $fh ) {
            $select->remove($fh);
        }
        else {
            ${ $data->{$fh} } .= $_ while (<$fh>);
        }
    }
}

This code is pretty much straight out of the documentation example in IO::Select.

This code does not attempt buffered I/O on a non-blocking handle.

During cleanup, System::Command does the following to its handles for $stdin/$stdout/$stderr and this is causing a lot of noise right now for us:

sub close {
    my ($self) = @_;

    # close all pipes
    my ( $in, $out, $err ) = @{$self}{qw( stdin stdout stderr )};
    $in  and $in->opened  and $in->close  || carp "error closing stdin: $!";
    $out and $out->opened and $out->close || carp "error closing stdout: $!";
    $err and $err->opened and $err->close || carp "error closing stderr: $!";

    # and wait for the child (if any)
    $self->_reap();

    return $self;
}

It's not uncommon to see this on CPAN, and it's even worse: We see close $fh or die..., which means all of this code now unexpectedly blows up.

Using readline in combination with select (aka IO::Select) is dangerous, if the remote process sends two lines that arrive as a single packet, readline will read the first line and leave the second line in the buffer but can_read() will not consider the handle readable until the next packet arrives, possibly deadlocking your process.

Digging deeper into our problem, we determined that the IO::Select method can_read is causing EAGAIN (or is it EWOULDBLOCK? They're the same error) to be put as an error on the file handle, which now causes close to exit non-zero.

can_read() doesn't do anything that can produce EAGAIN or EWOULDBLOCK and select() doesn't set the error flag on file handles.

(or is it EWOULDBLOCK? They're the same error)

Unfortunately, not quite, they can have different numbers, and a read(2) that would block on a non-blocking socket is permitted to return either.

In the change introduced by #20103, which addressed issues raised by #20060, they were concerned with losing errors on the file handle during close. While I can see some value in EISDIR being retained, I struggle to understand why we would retain ALL of the errors, especially if there is a successful read afterwards.

I would prefer #20103 to be reverted, but I think we're past that, so I want to suggest an alternative to @tonycoz, who worked on this originally: Possibly, we could restore the PerlIO_clearerr unless the error set are specific ones we care about. In my specific case, I would argue that EAGAIN should be cleared if there is later a successful read, right?

I do think clearing the error on an EAGAIN or EWOULDBLOCK is a reasonable workaround for historically questionable code.

I think only those two errors should be suppressed, other errors should already be handled at a lower level (EINTR), are real issues (EISDIR, EIO, EINVAL) or indicate a serious bug (EFAULT, EBADF). It's possible non-POSIX platforms return other errors reasonable to preserve.

Other functions like read() (perl's read not the system call), getc() do not do this error clearing that readline() historically did.

tonycoz added a commit to tonycoz/perl5 that referenced this issue Jan 9, 2025
(or the equivalent EWOULDBLOCK)

This allows questionable code that tries to combine select and the
readline flavour of buffered I/O to limp along.  Such code is still
risky due to select() checking the underlying OS handle and not
the perl handle.

Fixes Perl#22883
tonycoz added a commit to tonycoz/perl5 that referenced this issue Jan 12, 2025
(or the equivalent EWOULDBLOCK)

This allows questionable code that tries to combine select and the
readline flavour of buffered I/O to limp along.  Such code is still
risky due to select() checking the underlying OS handle and not
the perl handle.

Fixes Perl#22883
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dist-IO issues in the dual-life blead-first IO distribution type-core
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants