
A fellow Stackoverflower tried to use @ARGV in his END block but was unable to.

Why is it that @ARGV is only defined inside the BEGIN block with the following one-liner:

$ perl -lne 'BEGIN{ print "BEGIN"  if @ARGV }
                    print "MIDDLE" if @ARGV }
                  { print "END"    if @ARGV  ' file
  BEGIN

perldoc perlrun doesn't shed any light on the matter. What's going on here?

Zaid
  • @PaulTomblin : Nope. That quirky thing Perl allows you to do is affectionately called the [Eskimo kiss](http://stackoverflow.com/q/2897853/133939) `}{` – Zaid Jan 02 '12 at 16:47
  • @JonathanLeffler : The `END` is implicit in the `}{`. One could also have written it as `} END {` or `statement; END {`. This is something only possible with a one-liner. As for the three lines, there is nothing stopping me from writing it all out on one line but I hate it when the code activates the scroll bar. – Zaid Jan 02 '12 at 17:03
  • Run the original script with 2 or more file names. Then `@ARGV` is defined in the middle. It still isn't defined when the block with the implied END (it isn't strictly the END block; it is just a block that is executed after the `while (<>){ ... }` loop has completed) is executed, because all the arguments have been shifted out of `@ARGV` by that time. – Jonathan Leffler Jan 02 '12 at 17:38
  • I assume there is some implicit shifting going on, but it is strange that it is not mentioned anywhere. `$ARGV` is probably assigned `shift @ARGV` during the implicit `open`, e.g. `$ARGV = shift @ARGV; open ARGV or warn ...` – TLP Jan 02 '12 at 21:30

2 Answers


First, arrays cannot be undefined. You are checking if the array is empty. To understand why it's being emptied, you need to understand -n. -n surrounds your code with

LINE: while (<>) {
   ...
}

which is short for

LINE: while (defined($_ = <ARGV>)) {
   ...
}

ARGV is a magical handle that reads through the files listed in @ARGV, shifting out the file names as it opens them.

$ echo foo1 > foo
$ echo foo2 >>foo

$ echo bar1 > bar
$ echo bar2 >>bar

$ echo baz1 > baz
$ echo baz2 >>baz

$ perl -nlE'
    BEGIN { say "Files to read: @ARGV" }
    say "Read $_ from $ARGV. Files left to read: @ARGV";
' foo bar baz
Files to read: foo bar baz
Read foo1 from foo. Files left to read: bar baz
Read foo2 from foo. Files left to read: bar baz
Read bar1 from bar. Files left to read: baz
Read bar2 from bar. Files left to read: baz
Read baz1 from baz. Files left to read:
Read baz2 from baz. Files left to read:

Keep in mind that BEGIN blocks are executed as soon as they are compiled. The <ARGV> read appears earlier in the program, but it has not been executed yet when the BEGIN block runs, so @ARGV has not been modified at that point.

-n is documented in perlrun. ARGV, @ARGV and $ARGV are documented in perlvar.

ikegami
  • Nice clear example here. Unfortunately `perlrun` doesn't mention anything about the implicit shifting. Nor does `perlvar`. – Zaid Jan 03 '12 at 07:11
  • @Zaid, oh indeed. I'll submit a patch tomorrow if I remember. – ikegami Jan 03 '12 at 09:09

A BEGIN block runs before anything else. At that point, @ARGV has everything being passed and a test for non-emptiness returns true. When the END block runs, the elements of the original @ARGV have been shifted away by the implicit while(<>) {...} loop generated by the '-n' switch. Since there is nothing left, the empty @ARGV tests false. Change the END block to:

{print "END" if defined @ARGV}

As each element of @ARGV is shifted, it is stored in $ARGV. Hence, the block could also be rewritten:

{print "END" if $ARGV}
JRFerguson
  • I don't think that's the case since `perlrun` says the `-n` switch isn't much more than an implicit `while(<>){...}` loop. Doesn't say anything about `@ARGV` being consumed. – Zaid Jan 02 '12 at 17:08
  • Of course, I could be misreading the perldocs or be misled by it. Both have happened in the past :( – Zaid Jan 02 '12 at 17:10
  • It appears that a `while (<>) { .. }` loop (independent of the use of `-p` or `-n`) does shift elements off @ARGV as it processes them. Try: `perl -e 'while(<>){print "CHECK: @ARGV\n";}' x y z | uniq` (where `x`, `y` and `z` should be files with at least one line in them). You could add a BEGIN too, if you want. You'll see that @ARGV is shifted as the files are processed. – Jonathan Leffler Jan 02 '12 at 17:32
  • @JRFerguson Do you know this, or are you speculating? Do you have a link to somewhere this is mentioned? – TLP Jan 02 '12 at 21:31
  • @TLP Yes, I know this from my own heuristics as well as Jonathan's. As for independent documentation, the Perl Cookbook notes, "...the file-processing loop removes one argument at a time from '@ARGV' and copies the filename into the global variable $ARGV. If the file cannot be opened, Perl goes on to the next one. Otherwise, it processes a line at a time. When the file runs out, the loop goes back and opens the next one, repeating the process until '@ARGV' is exhausted." – JRFerguson Jan 02 '12 at 22:02
  • The first replacement you suggest is very bad. 1) The code you suggest using doesn't compile. 2) `if defined(@ARGV)` is worse than the original and correct `if @ARGV`. (It doesn't actually check if @ARGV is empty like the OP's code, so it's overly complex, and could return false positives.) – ikegami Jan 03 '12 at 01:38
  • The second code replacement is even worse. 1) It also doesn't compile. 2) `$ARGV` will only be false if the line you are reading comes from a file `0`. Why would you check that? – ikegami Jan 03 '12 at 01:40
  • @ikegami Thanks for the corrections and insights and for the very clear example to the OP. – JRFerguson Jan 03 '12 at 13:05