
I have the following code:

#!/usr/bin/env perl

use 5.0360;
use warnings FATAL => 'all';
use autodie ':default';
use Devel::Confess 'color'; # not essential, but better error reporting

open my $view, "zcat a.big.file.vcf.gz|"; # zcat or bcftools
while (<$view>) {
    next unless /^#CHROM\t/;
    last;
}
close $view;

The above code crashes with the error:

Can't close(GLOB(0x55adfa96ebf8)) filehandle: '' at (eval 11)[/home/con/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/Fatal.pm:1683] line 74
 at (eval 11)[/home/con/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/Fatal.pm:1683] line 74.
    main::__ANON__[(eval 11)[/home/con/perl5/perlbrew/perls/perl-5.36.0/lib/5.36.0/Fatal.pm:1683]:86](GLOB(0x55adfa96ebf8)) called at mwe.pl line 13
Command exited with non-zero status 255

If I comment out `last`, the code runs without a problem. However, the file is huge, so reading all of it makes a significant difference in running time.

The code also works if I remove `close $view`, but closing the filehandle is proper practice.

How can I run the code with both last and close $view?

zdim
con
  • Your entire loop does nothing. Your sample code has no way of succeeding or failing, beyond exiting unfavourably. It either runs a long time, or a short time. If I were to guess what the errors are about, I would say it is about fatal warnings and autodie. Remove those, and your code will likely improve. – TLP Jul 06 '22 at 21:53
  • @TLP I need autodie. The loop does nothing because this is a minimal working example of much larger code – con Jul 06 '22 at 22:04
  • You never "need" autodie, you can just check return values on your own and `die` where you need to. `autodie` is just a convenience. Although `Fatal.pm` in the error is likely from the warnings pragma, not autodie. That will cause your code to "crash" every time there is a warning. Closing a file handle is not required, it will close automatically when the program ends or the file handle goes out of scope. – TLP Jul 06 '22 at 22:24
  • Your example is so minimal that it is almost pointless. Perhaps you should take a look at the core module [IO::Uncompress::Gunzip](https://perldoc.perl.org/IO::Uncompress::Gunzip). – TLP Jul 06 '22 at 22:29

1 Answer


When you `last` out of reading from that process (zcat here) and close the pipe before the process is done writing, the process gets a SIGPIPE. As the documentation for close puts it:

Closing the read end of a pipe before the process writing to it at the other end is done writing results in the writer receiving a SIGPIPE.

So when `close` then waits on the process, it gets a non-zero status and returns false (as seen below). That's all. The rest -- the program "crashing" -- is up to autodie, which throws an exception. Here is the same thing without autodie (or fatal warnings):

use warnings;
use strict;
use feature 'say';

use Scalar::Util qw(openhandle);

my $file = shift // die "Usage: $0 file\n";

open my $view, "zcat $file |" or die "Can't pipe-open zcat: $!";

while (<$view>) {
    next unless /^#CHROM\t/;  # (added to my test file)
    say "Matched --> $_";
    last;
}

say "Filehandle good? --> ", openhandle($view) // 'undef';  # GLOB(...)
    
close $view or warn $!  
    ? "Error closing pipe: $!" 
    : "Command exit status: $?";                            # 13

say "Program received a signal ", $? & 127  if $? & 127;    # 13

I get `Command exit status: 13`, so `close` didn't return true while `$!` is false. This indicates that the only issue was the non-zero status, in line with the documentation for close:

If the filehandle came from a piped open, close returns false if one of the other syscalls involved fails or if its program exits with non-zero status. If the only problem was that the program exited non-zero, $! will be set to 0.

We did get a signal, 13 (SIGPIPE, see man 7 signal), which terminated the program, so there was no particular exit code (`$? >> 8` is indeed zero) and no core was dumped (`$? & 128` is zero). See `$?` in perlvar.
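The bit arithmetic on `$?` can be collected in one place. The following is a minimal sketch, not part of the original program: `decode_status` is a hypothetical helper name, and the two sample values (13 and `1 << 8`) are hard-coded for illustration rather than taken from a real child process.

```perl
use strict;
use warnings;

# Hypothetical helper: split a wait status, as found in $?, into its parts.
sub decode_status {
    my ($status) = @_;
    return (
        signal => $status & 127,             # signal that killed the child, if any
        core   => ($status & 128) ? 1 : 0,   # 1 if a core dump was produced
        exit   => $status >> 8,              # the child's own exit code
    );
}

# 13 is the value seen in the run above: killed by signal 13, no exit code.
my %killed = decode_status(13);
printf "signal=%d core=%d exit=%d\n", @killed{qw(signal core exit)};

# For comparison: a child that ran to completion and called exit(1).
my %failed = decode_status(1 << 8);
printf "signal=%d core=%d exit=%d\n", @failed{qw(signal core exit)};
```

This prints `signal=13 core=0 exit=0` and then `signal=0 core=0 exit=1`, which shows why a status of 13 means "killed by a signal" and not "exited with code 13".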

Since the exit status is non-zero close returns false and autodie throws its exception.


So what to do with this?

That close must stay and be checked, of course.

Even if the SIGPIPE sent to zcat could be ignored, as I've seen claimed in some docs, you wouldn't want that -- it's there on purpose, to let the writer know that there are no readers so that it can stop!

Finally, it is autodie that kills the program, and it can be disabled lexically. (This satisfies the "need" for autodie stated in a comment.) So put this pipe reading and early close in a block

READ_FROM_PIPE: { 
    no autodie qw(close);
    # do the pipe-open, read from the pipe, close() it...
};
# rest of code 

Don't forget to suitably adjust your other error handling in this code.
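A fuller version of that block might look like the sketch below. It makes several assumptions that are not in the original code: a child perl (run via `$^X`) stands in for zcat so that the example needs no input file, signal 13 is taken to be SIGPIPE as on Linux, and the decision to tolerate exactly "killed by SIGPIPE, no exit code" is one possible policy for the "other error handling", not the only one.

```perl
use strict;
use warnings;
use autodie ':default';

# Make sure the child does not inherit an ignored SIGPIPE (precaution, assumption)
local $SIG{PIPE} = 'DEFAULT';

# Stand-in for "zcat file |" (assumption): a child perl that writes many lines
my @cmd = ($^X, '-e', 'print "line $_\n" for 1 .. 1_000_000');

my $first;
READ_FROM_PIPE: {
    no autodie qw(close);          # only close() is exempt, only in this block

    open my $view, '-|', @cmd;     # open() is still checked by autodie
    while (my $line = <$view>) {
        $first = $line;
        last;                      # stop early; the writer should get SIGPIPE
    }

    unless (close $view) {
        die "Error closing pipe: $!" if $!;    # a real syscall failure
        my $signal = $? & 127;
        my $exit   = $? >> 8;
        # We hung up on purpose, so tolerate SIGPIPE (13 on Linux) and nothing else
        die "Child failed: signal $signal, exit code $exit"
            unless $signal == 13 && $exit == 0;
    }
}

print "First line: $first";
```

With this arrangement every other `open`, `close`, and so on in the program still dies on failure, while the early close of the pipe is judged by its actual wait status.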

(I've had weird experience with no autodie "leakage" out of its scope, see here. But that was a different case, and fixed by autodie 2.30 so hopefully not of concern.)

Another option is to wrap all of this, or just the close(), in eval instead, which the docs describe as good practice. Then see the autodie documentation on how to work with autodie exceptions.
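As a rough sketch of the eval approach: the code below leaves autodie fully enabled and catches the exception thrown by `close`. The assumptions are the same as before and are not from the original code: a child perl stands in for zcat, 13 is SIGPIPE as on Linux, and `$?` is assumed to still hold the child's wait status when the exception is inspected. `isa` and `matches` are methods of `autodie::exception`.

```perl
use strict;
use warnings;
use autodie ':default';
use Scalar::Util qw(blessed);

# Make sure the child does not inherit an ignored SIGPIPE (precaution, assumption)
local $SIG{PIPE} = 'DEFAULT';

# Stand-in for "zcat file |" (assumption): a child perl that writes many lines
my @cmd = ($^X, '-e', 'print "line $_\n" for 1 .. 1_000_000');

open my $view, '-|', @cmd;
my $first = <$view>;               # read one line, then stop early

my $outcome = 'closed cleanly';
my $closed  = eval { close $view; 1 };
if (!$closed) {
    my $err = $@;

    # Rethrow anything that is not an autodie exception from close()
    die $err unless blessed($err)
        && $err->isa('autodie::exception')
        && $err->matches('close');

    # Assumes $? still holds the child's wait status at this point
    my $signal = $? & 127;
    die $err unless $signal == 13;   # 13 is SIGPIPE on Linux (assumption)
    $outcome = 'writer stopped by SIGPIPE';
}

print "First line: $first";
print "Outcome: $outcome\n";
```

Compared with `no autodie qw(close)`, this keeps the default "die on failure" behavior and handles only the one case that is expected, at the cost of a little more code around the call.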

zdim
  • Re "*As for why `$!` is empty*", I meant the `''` of `Can't close(GLOB(0x55adfa96ebf8)) filehandle: ''`. This is probably because `close` error handling needs to be quite different for `open -|` handles – ikegami Jul 07 '22 at 13:52
  • @ikegami Oh, right, that ... looks like that's on `autodie`? I mean there is no error to quit over really, just that non-zero exit (with signal code only). It seems that when a subprocess is killed by a signal there is nothing in `$!`, so `autodie` should either add a message or just leave it altogether? – zdim Jul 07 '22 at 16:37
  • @ikegami Or I suppose `close` should fill `$!` for when a signal is all that there is. But then that'd be for other calls as well -- I tried with killing a `system` command externally and I got `15` in `$?` and empty `$!` – zdim Jul 07 '22 at 16:39