How can I detect end of file on a pipe in a Perl script?

Question

In a Perl script, I am running another process (openssl) and communicating with it via pipes. I run openssl with IPC::Run::start.

$openssl_stdin_handle = Symbol::gensym();
$openssl_stdout_handle = Symbol::gensym();
$openssl_stderr_handle = Symbol::gensym();
$harness = IPC::Run::start(\@openssl_command_words, '<pipe', $openssl_stdin_handle, '>pipe', $openssl_stdout_handle, '2>pipe', $openssl_stderr_handle);
my $io_select_for_stdin = IO::Select->new($openssl_stdin_handle);
my $io_select_for_stdout = IO::Select->new($openssl_stdout_handle);
my $io_select_for_stderr = IO::Select->new($openssl_stderr_handle);

I write plaintext data to openssl with $io_select_for_stdin->can_write(4) and syswrite. I read encrypted data from openssl with $io_select_for_stdout->can_read($timeout) and sysread. I can't write all the plaintext data and then read all the encrypted data because, if there is a lot of plaintext data, the pipe from openssl's standard output fills up and openssl hangs. Instead, I have to write some plaintext, see if there is any encrypted data to read, read it if available, then go back and write some more.

Once I'm finished writing plaintext to openssl, I close $openssl_stdin_handle and call $io_select_for_stdout->can_read(1). Here is the code (it is within a loop labeled READ_WRITE_LOOP):

$OS_ERROR = 0;
if  (not $io_select_for_stdout->can_read(1))
    {
    if  ($OS_ERROR == 0) {last READ_WRITE_LOOP;}   # nothing to read -- not an error
    return "error";
    }

my $number_of_cyphertext_bytes_read = sysread($openssl_stdout_handle, $cyphertext_block, $block_size);

The problem is that can_read could return false because openssl is not ready to write, or because openssl has finished writing and has closed its standard output. In the first case, I want to retry the read; in the second case, I want to close $openssl_stdout_handle and move on to the next phase of the program. How do I distinguish between these two cases?

The code above works because I have the can_read timeout set to one second. If I shorten the timeout to 0.01 seconds, the code works sometimes and fails other times. I'd like a shorter timeout, but I need the code to work every time.

I know that calling eof($openssl_stdout_handle) is not the answer. You can't mix calls to sysread and eof. Is there a clean way to detect whether openssl has closed its side of the pipe that I access via $openssl_stdout_handle?

Could a particular error constant (`%!`) be set? They are listed in [POSIX#ERRNO](https://perldoc.perl.org/POSIX#ERRNO) – zdim, Dec 17 '22 at 22:08
What I see is that the can_read method sets $OS_ERROR to zero if there is no data to read or if end-of-file is reached. Reaching the end of file (really end of pipe in this case) is not an error. – David Levner, Dec 19 '22 at 23:51
No, that is wrong. `$!` (`$OS_ERROR`) is only meaningful when a system call indicated that an error occurred. Reaching EOF is not an error. `can_read` returns true both when the handle has data to read and when it has reached EOF, and in either case `$!` is meaningless. `$!` does not help you detect EOF. – ikegami, Dec 20 '22 at 04:19
The [IO::Select documentation](https://perldoc.perl.org/IO::Select) for `can_read` states: "To distinguish between timeout and error, set `$!` to zero before calling this method, and check it after an empty list is returned." So checking `$!` is useful for distinguishing between errors and timeout, but not for detecting eof. – David Levner, Dec 21 '22 at 13:26

ikegami · Accepted Answer · 2022-12-19T16:10:59.643

To answer the title question, sysread returns zero on EOF (not to be confused with undef which is returned on error), pipe or not.
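
As a rough illustration of that rule, here is a minimal sketch that uses only core modules. A plain `pipe` stands in for openssl's standard output, and the names `$reader`, `$writer`, `$received`, and `$saw_eof` are made up for the example. A timeout from `can_read` means "retry", while a return value of zero from `sysread` means the writer has closed its end.

```perl
use strict;
use warnings;
use IO::Select;

# Stand-in for the child's stdout: a plain pipe that we fill and then close.
pipe(my $reader, my $writer) or die "pipe failed: $!";
defined(syswrite($writer, "ciphertext bytes")) or die "syswrite failed: $!";
close($writer);    # the "child" is done; the reader sees EOF after the data

my $select   = IO::Select->new($reader);
my $received = '';
my $saw_eof  = 0;

READ_LOOP:
while (1) {
    $! = 0;
    my @ready = $select->can_read(0.01);
    if (!@ready) {
        die "select failed: $!" if $! != 0;    # a real error
        next READ_LOOP;                         # timeout: nothing yet, retry
    }

    # can_read also reports the handle as ready at EOF, so the return
    # value of sysread is what tells the two situations apart.
    my $n = sysread($reader, my $chunk, 4096);
    die "sysread failed: $!" if !defined $n;    # undef: error
    if ($n == 0) {                               # 0: writer closed the pipe
        $saw_eof = 1;
        last READ_LOOP;
    }
    $received .= $chunk;
}
close($reader);
print "read ", length($received), " bytes, eof=$saw_eof\n";
```

In a script that is still feeding the child, the timeout branch is presumably where the next block of plaintext would be written before retrying the read.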

As for the actual question, it seems to me you could use the following:

use IPC::Run qw( run );
run \@cmd, '<', \$unencrypted, '>', \my $encrypted;
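
A self-contained sketch of the scalar form might look like the following. It assumes IPC::Run is installed from CPAN, and it substitutes a Perl one-liner that upper-cases its input for the real openssl command, so the `@cmd` contents and the sample data are purely illustrative. The input is deliberately larger than a typical pipe buffer, since that is the situation that caused the hang in the question.

```perl
use strict;
use warnings;
use IPC::Run qw(run);

# Hypothetical stand-in for the openssl command: a Perl one-liner that
# upper-cases whatever arrives on its standard input.
my @cmd = ($^X, '-e', 'print uc while <STDIN>');

# About 1.5 MB of input, well beyond what a pipe buffer usually holds.
my $unencrypted = "some plaintext\n" x 100_000;

# run() interleaves the writes and reads itself and returns once the
# child has exited; it returns false if the child failed.
run \@cmd, '<', \$unencrypted, '>', \my $encrypted
    or die "command failed: exit status $?";

print length($encrypted), " bytes received\n";
```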

If you don't want to hold the entire file in memory, you can use a callback (sub { }) instead of a pipe for the output.

run \@cmd,
   '<', sub {
      # Called repeatedly until it returns `undef`.
      # Return a string of bytes to send, or `undef` to signal eof...
   },
   '>', sub {
      # Do something with the string of bytes provided as argument...
   };
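
To make the callback form concrete, here is a small sketch under the same assumptions as before: IPC::Run is installed, and a Perl one-liner stands in for openssl. The `@plaintext_blocks` list and the `$collected` buffer are invented for the example; a real script would more likely read blocks from one file and write the output to another.

```perl
use strict;
use warnings;
use IPC::Run qw(run);

# Hypothetical stand-in for the openssl command.
my @cmd = ($^X, '-e', 'print uc while <STDIN>');

# Pretend these chunks are read from a large file one block at a time.
my @plaintext_blocks = ("first block\n", "second block\n", "third block\n");
my $collected = '';

run \@cmd,
    '<', sub {
        # shift returns undef once the list is exhausted, which is what
        # signals end of input to the child.
        return shift @plaintext_blocks;
    },
    '>', sub {
        my ($chunk) = @_;
        # Collected in memory here only so the result can be inspected.
        $collected .= $chunk;
    }
    or die "command failed: exit status $?";

print length($collected), " bytes received\n";
```

The output callback may be invoked with chunks of any size, so it should not assume that one call corresponds to one input block.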

edited Dec 19 '22 at 16:10

answered Dec 17 '22 at 22:40

I believe both methods work but I am doing more testing and will leave a more detailed comment soon. – David Levner Dec 20 '22 at 03:28
I tested using pipes, scalars and subroutines with IPC::Run and they all work. Subroutines take slightly longer with large amounts of data. Scalars are the simplest solution (and make the original question moot), so I will go with them. Thanks to everyone who replied. – David Levner Dec 22 '22 at 20:41
