
I don't know anything about signals, and only a little about pipes.

From the comments on zdim's answer here it seems that signals may interfere with pipe communication between parent and child processes.

I was told that, if you're using IO::Select and sysread, then the exit of a child process could somehow mess up the behavior of IO::Select::can_read, especially if there are multiple child processes.

How should I account for signals when using pipes? The code below is an example where signals are not accounted for.

use warnings;
use strict;
use feature 'say';

use Time::HiRes qw(sleep);
use IO::Select; 

my $sel = IO::Select->new;

pipe my $rd, my $wr;
$sel->add($rd); 

my $pid = fork // die "Can't fork: $!";  #/

if ( $pid == 0 ) {     # Child code

    close $rd; 
    $wr->autoflush;

    for ( 1..4 ) {

        sleep 1;
        say "\tsending data";
        say $wr 'a' x ( 120 * 1024 );
    }

    say "\tClosing writer and exiting";
    close $wr;

    exit; 
}

# Parent code
close $wr;    
say "Forked and will read from $pid";

my @recd;

READ:
while ( 1 ) {

    if ( my @ready = $sel->can_read(0) ) {  # beware of signals

        foreach my $handle (@ready) {

            my $buff;
            my $rv = sysread $handle, $buff, ( 64 * 1024 );
            warn "Error reading: $!" if not defined $rv;

            if ( defined $buff and $rv != 0 ) {
                say "Got ", length $buff, " characters";
                push @recd, length $buff; 
            }

            last READ if $rv == 0;
        }
    }
    else {
        say "Doing else ... ";
        sleep 0.5; 
    }
}   
close $rd;

my $gone = waitpid $pid, 0;

say "Reaped pid $gone";
say "Have data: @recd"
Stephen
  • Again, are you married to using pipes for this purpose? With a file for interprocess communication, you will have less hassles with capacity, deadlocks, signals, and portability. And debugging (at the end of the program you can inspect the file and see if it contains what you expect). – mob Feb 01 '18 at 18:32
  • I was trying to use pipes because it seems like I really should learn how they work to be a good engineer. But it certainly does seem excessively complex by comparison. I feel like I’m making progress with a good setup but I haven’t reached a point of comfort where I think it should be sufficiently reliable – Stephen Feb 01 '18 at 18:35
  • Pipes would also be a lot faster I believe – Stephen Feb 01 '18 at 18:36
  • But yes, maybe I will abandon this effort :( – Stephen Feb 01 '18 at 18:37
  • If you're not comfortable with IPC in general then switching to using files won't help much: many of the same issues will arise, and in the end you must be aware of the consequences of running processes in parallel while sharing a resource. I assume you've read and absorbed [`perldoc perlipc`](https://perldoc.perl.org/perlipc.html)? – Borodin Feb 01 '18 at 19:10
  • @Borodin if signals can interfere with pipes then by moving to files I will avoid that potential interference. I don't really get why signals should interfere with pipes, I'm just going off of what zdim said in the comments on the linked post. – Stephen Feb 01 '18 at 19:22
  • @Steph: *"I don't really "understand" it"* This is nothing to do with comprehension: *"2.1 [with clause] Infer something from information received (often used as a polite formula in conversation) ‘Apart from the art department, I understand that the school gives a pretty good education’* (Oxford dictionaries) – Borodin Feb 04 '18 at 10:33
  • @Borodin for context for everybody else this is related to a discussion of proposed edits. Yes, I'm familiar with that sense of the word as well. Still, I wanted to ensure somebody less familiar with English wouldn't assume the other definition of the word. – Stephen Feb 04 '18 at 18:50
  • @Stephen: I'm sorry but that's ridiculous. To do that comprehensively would be all but impossible, and the interpretation that you meant doesn't make sense in this context. It's best to just write good English. I'd appreciate it if you would avoid removing other people's edits unless they are genuinely wrong or confusing. – Borodin Feb 04 '18 at 19:23
  • It's my question, I think I should get some deference in how I'd like to phrase it. And my own edit makes perfect sense and is indeed how it actually happened. – Stephen Feb 04 '18 at 19:24

2 Answers


Two things.

  1. Writing to a pipe after the reader has been closed (e.g. because the process on the other end exited) raises SIGPIPE, which kills the writing process by default. You can ignore this signal ($SIG{PIPE} = 'IGNORE';) so that the write fails with error EPIPE instead.

    In your case, if you wanted to handle that error instead of having your program killed, simply add

    $SIG{PIPE} = 'IGNORE';
    
  2. If you have any signal handler defined (e.g. using $SIG{...} = sub { ... };, but not $SIG{...} = 'IGNORE'; or $SIG{...} = 'DEFAULT';), long-running system calls (e.g. reading from a file handle) can be interrupted by a signal. If this happens, they will return with error EINTR to give the signal handler a chance to run. In Perl, you don't have to do anything but restart the system call that failed.

    In your case, you have no signal handlers defined, so this doesn't affect you.
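To make the first point concrete, here is a minimal sketch of how EPIPE shows up once SIGPIPE is ignored. It assumes a Unix-like system and closes the read end in the same process just to simulate the reader going away; the variable names ($written, $got_epipe) are illustrative only. The write does not return a string "EPIPE": syswrite returns undef and $! is set to EPIPE, which can be tested through the %! hash.

```perl
use strict;
use warnings;

# Ignore SIGPIPE so that writing to a pipe without a reader
# fails with an error instead of killing this process.
$SIG{PIPE} = 'IGNORE';

pipe(my $rd, my $wr) or die "Can't create pipe: $!";
close $rd;    # simulate the reader going away

# syswrite returns undef on failure and sets $!
my $written = syswrite($wr, "hello\n");

# %! is true for the key naming the current errno value
my $got_epipe = ( !defined($written) && $!{EPIPE} ) ? 1 : 0;

if ( !defined $written ) {
    if ($got_epipe) {
        print "Reader is gone; stop writing\n";
    }
    else {
        die "Write failed: $!";
    }
}
```

With buffered output (print or say on a handle without autoflush) the failure may only be reported when the buffer is flushed, so checking the return value of close matters as well.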


By the way, you check $rv == 0 even when $rv is known to be undefined, and you place the length of the data in @recd instead of the data itself. In fact, it doesn't make much sense to use an array there at all. Replace

my @recd;

...

my $rv = sysread $handle, $buff, ( 64 * 1024 );
warn "Error reading: $!" if not defined $rv;

if ( defined $buff and $rv != 0 ) {
    say "Got ", length $buff, " characters";
    push @recd, length $buff; 
}

last READ if $rv == 0;

...

say "Have data: @recd"

with

my $buf = '';

...

my $received = sysread($handle, $buf, 64 * 1024, length($buf));
warn "Error reading: $!" if !defined($received);
last READ if !$received;

say "Got $received characters";

...

say "Have data: $buf"
ikegami
  • Thanks, this provides more context that I was missing. I understand now that the interruption occurs so that signal handlers I might have defined can proceed. – Stephen Feb 05 '18 at 17:00
  • One thing I don't understand: How could the write return `EPIPE`? Do you mean that the return value of the write condition itself would be a string EPIPE? How would that be better? – Stephen Feb 05 '18 at 17:04
  • @zdim since you're making corrections to your posts... I think it is wrong to loop over handles with `foreach my $handle (@ready)` when you know that there will be only one handle. In particular, since you are invoking the command `last READ;` the moment that you see a closed handle, the code won't even work with multiple handles. – Stephen Feb 06 '18 at 04:06
  • @ikegami, one thing that could be elaborated in the post: Why don't we ever try to re-read the handle if we encounter a warning ($rv is not defined). Couldn't there be a recoverable error? Maybe you explained this to zdim but the comments are deleted now. – Stephen Feb 06 '18 at 04:22
  • As already mentioned, you should retry if you get `EINTR`, except it's not possible to get `EINTR` error in your program. – ikegami Feb 06 '18 at 04:52

Signals may also interrupt I/O functions, causing them to fail with $! set to EINTR. You should check for that error and retry when it happens.

Not doing so is a common source of hard-to-find bugs.
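The retry can be sketched as follows. This is only an illustration under stated assumptions: sysread_retry is a hypothetical helper name, not part of any module, and the SIGALRM handler exists solely to force an interruption while the parent is blocked in sysread (a child that writes after a delay stands in for a slow writer).

```perl
use strict;
use warnings;

# Hypothetical helper: like sysread, but restarts the call when it
# was interrupted by a signal (EINTR). Appends to the buffer.
sub sysread_retry {
    my ($fh, $buf_ref, $len) = @_;
    while (1) {
        my $rv = sysread($fh, $$buf_ref, $len, length $$buf_ref);
        return $rv if defined $rv;    # bytes read, or 0 at EOF
        next if $!{EINTR};            # interrupted by a signal: retry
        return;                       # real error; $! holds the reason
    }
}

pipe(my $rd, my $wr) or die "Can't create pipe: $!";

my $pid = fork() // die "Can't fork: $!";
if ( $pid == 0 ) {                    # child: write after a delay
    close $rd;
    sleep 2;
    syswrite($wr, "done\n");
    exit 0;
}

# Parent: having a handler installed is what makes EINTR possible
close $wr;
my $alarms = 0;
$SIG{ALRM} = sub { $alarms++ };
alarm 1;                              # fires while sysread is blocked

my $data = '';
my $rv = sysread_retry($rd, \$data, 64 * 1024);
die "Error reading: $!" if !defined $rv;

waitpid $pid, 0;
print "Read $rv bytes after $alarms interruption(s)\n";
```

Without the retry, the first sysread would return undef with $! set to EINTR, and code that treats every undef as fatal (or as end of data) would give up before the child had written anything.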

salva