2

Consider this Python script:

for i in range(4000):
    print(i)

and this Perl script:

for my $i (0..4000-1) {
    print $i, "\n";
}

python3 pipe.py | head -n3000 >/dev/null produces an error:

Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe

but

perl pipe.pl | head -n3000 >/dev/null produces no error (in Perl v5.26.1).

Why is there such a discrepancy between Python and Perl? How can I make Perl produce a similar error message?

porton
  • @porton, That leads to SIGPIPE being sent to the process, and the default behaviour of SIGPIPE is to kill the receiving process. The exit status of the `perl` process will reflect this. Python must change the default. – ikegami Aug 30 '18 at 18:57

2 Answers

9

What's going on here is that in both cases you have a process writing to a pipe whose read end was closed (by head exiting after it has read the requested number of lines).

This causes a SIGPIPE signal to be sent to the writing process. By default this kills the process. The process can ignore the signal if it wants to, which just makes the write call fail with an EPIPE error.

Starting with version 3.3, Python raises a BrokenPipeError exception in this case, so it looks like Python 1) ignores SIGPIPE by default and 2) translates EPIPE to a BrokenPipeError exception.
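
The two points above can be checked with a small sketch. It assumes a POSIX system and a stock CPython interpreter; the function name write_to_closed_pipe is made up for this illustration. It creates a pipe, closes the read end, and then writes to it with the low-level os.write, so no buffering is involved:

```python
import errno
import os
import signal


def write_to_closed_pipe():
    """Write to a pipe that has no reader and return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody can read from this pipe any more
    try:
        os.write(write_fd, b"data\n")  # unbuffered write straight to the pipe
    except BrokenPipeError as exc:
        return exc.errno  # the failed write surfaces as an exception
    finally:
        os.close(write_fd)
    return None


# CPython sets SIGPIPE to "ignore" at startup, so the write fails
# with an error instead of killing the process.
sigpipe_ignored = signal.getsignal(signal.SIGPIPE) == signal.SIG_IGN
print("SIGPIPE ignored:", sigpipe_ignored)
print("errno is EPIPE:", write_to_closed_pipe() == errno.EPIPE)
```

If the disposition of SIGPIPE were the default one instead, the same os.write call would terminate the process, which is what happens to perl in the question.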

Perl does not ignore or handle signals by default. That means it gets killed by SIGPIPE in your example, but because it is not the last command in the pipeline (that would be head here), the shell ignores how it exited. You can make it more visible by not using a pipeline:

perl pipe.pl > >(head -n3000 >/dev/null)

This piece of bash trickery makes perl write to a pipe, but not as part of a shell pipeline. I can't test it now, but at minimum this will set $? (the command exit status) to 141 in the shell (128 + signal number, which for SIGPIPE is 13), and it may also report a Broken pipe.
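
The "128 + signal number" arithmetic can be reproduced without a shell. The following is a rough sketch, assuming a POSIX system; child_code and run_writer_into_closed_pipe are names invented for this example. The child restores the default SIGPIPE disposition (so it behaves like perl), and the parent plays the role of head by reading one line and then closing its end of the pipe:

```python
import signal
import subprocess
import sys

# Child program: undo Python's "ignore SIGPIPE" default, then print far
# more than a pipe can hold so it is still writing when the reader leaves.
child_code = (
    "import signal\n"
    "signal.signal(signal.SIGPIPE, signal.SIG_DFL)\n"
    "for i in range(100000):\n"
    "    print(i)\n"
)


def run_writer_into_closed_pipe():
    """Start the writer, read one line like a short `head`, then hang up."""
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()
    proc.stdout.close()  # close the read end; the child's next write fails
    return proc.wait()


returncode = run_writer_into_closed_pipe()
# subprocess reports "killed by signal N" as -N; a shell reports 128 + N.
shell_status = 128 - returncode if returncode < 0 else returncode
print("returncode:", returncode)
print("as a shell would report it:", shell_status)
```

On a system where SIGPIPE is signal 13, the second value corresponds to the 141 mentioned above.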

You can deal with it manually in the Perl code, though:

  • Variant 1: Throw an error from the signal handler

    $SIG{PIPE} = sub { die "BrokenPipeError" };
    
  • Variant 2: Ignore the signal, handle write errors

    $SIG{PIPE} = 'IGNORE';
    ...
    print $i, "\n" or die "Can't print: $!";
    

    Note that in this case you have to think about buffering, however. If you don't enable autoflush (as in STDOUT->autoflush(1)) and output is going to a pipe or file, Perl will collect the text in an internal buffer first (and the print call will succeed). Only when the buffer gets full (or when the filehandle is closed, whichever happens first) is the text actually written out and the buffer emptied. This is why close can also report write errors.
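
The buffering effect is not specific to Perl. As an illustration only, here is the same situation sketched in Python, which already ignores SIGPIPE by default (the function name buffered_write_to_closed_pipe is hypothetical; a POSIX system is assumed). The write into the buffered stream succeeds even though the pipe has no reader, and the error only shows up when the buffer is flushed:

```python
import os


def buffered_write_to_closed_pipe():
    """Record which step notices that the pipe has no reader."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)               # no reader left
    out = os.fdopen(write_fd, "w")  # a pipe is not a terminal: block-buffered
    events = []
    try:
        out.write("hello\n")        # only lands in the in-memory buffer
        events.append("write ok")
        out.flush()                 # the real write to the pipe happens here
        events.append("flush ok")
    except BrokenPipeError:
        events.append("flush failed")
    try:
        out.close()                 # closing flushes again and may fail too
    except BrokenPipeError:
        events.append("close failed")
    return events


print(buffered_write_to_closed_pipe())
```

In a Perl script using Variant 2, the equivalent precaution is to check the return value of close (or to enable autoflush) rather than relying on print alone.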

melpomene
4

The Python exception is raised because the reading process (head) closes its end of the pipe, so the script's next attempt to write fails; see this post. The Python community deliberately changed the default so that the user is informed (see the linked post).

This is not seen in Perl because it gets killed by that signal (termination is the signal's default disposition) without saying anything. So you could override that:

use warnings;

$| = 1;

$SIG{PIPE} = sub { die $! };

for my $i (0..4_000-1) {
    print $i, "\n";
}

(Without the $| = 1 I need more than 5_000 iterations above for it to happen.)

Or rather, issue a warning (instead of dying) so that the program continues:

local $SIG{PIPE} = sub { warn "Ignoring $_[0]: $!" };

Update: Given the clarification provided in a comment, I'd recommend making this handler global. It can still be overridden with a local one in particular scopes. Besides, there is nothing wrong with surviving a SIGPIPE instead of being terminated, as long as there is a warning.

Note that even without that the exit status of the Perl process will show the problem. Run echo $? after the process "completes" (is quietly terminated); I get 32 on my system.

To mimic Python's behavior further you could issue a die in the signal handler and handle that exception, by putting it all in eval.

Thanks to melpomene and ikegami for comments.

zdim
  • I'm pretty sure perl is not supposed to handle SIGPIPE by default. – melpomene Aug 30 '18 at 18:57
  • @melpomene Perhaps in particular situations? If it gets a `SIGPIPE` it must do something or it would be killed. I think we'd be upset if scripts printed to STDERR for over 4kB. Apparently there was a good deliberation in Python community about informing the user, etc. – zdim Aug 30 '18 at 19:00
  • I'm also pretty sure it is getting killed. – melpomene Aug 30 '18 at 19:01
  • OK, I got it -- it appears that it does get killed, quietly. (Of course I get output since that's when it gets killed!) .... testing before edit ... – zdim Aug 30 '18 at 19:07
  • But some of our Perl scripts (after upgrading to a newer version of Perl) produce "Unable to flush stdout: Broken pipe" in the logs. What in Perl can cause this message? (This is the very question I need to answer. I was assigned the task to localize and eliminate this error, given that we don't yet know which of our Perl scripts causes this.) – porton Aug 30 '18 at 22:49
  • @porton After reading that question over (and melpomene's answer!) ... perhaps a signal handler won't do just so. Ideas: either force a flush of stdout and catch (a presumed sigpipe) in the code, or do that at the very end. That way you do what the interpreter seems to be doing but with a chance to catch the signal. I presume that there is an actual signal. – zdim Aug 30 '18 at 23:52