I have a Perl script that I call with the -p and -f options. I'd like to pass command-line parameters to the script through @ARGV.

For example, opl.pl is a script that concatenates each line that doesn't start with xx onto the previous line that starts with xx, with '#' as a separator, after flagging pre-existing '#' characters:

# Usage: perl -pf opl.pl file.txt
BEGIN {$recmark = @ARGV[0] if $#ARGV; }
$recmark  = "xx" if (! defined $recmark);
chomp;
print "\n" if /$recmark/;
s/#/__hash__/g;
$_ .= "#"

The script works when no additional parameters are on the command line. E.g., perl -pf opl.pl filexx.txt with filexx.txt:

xx line #1
line 2
line 3
xx line 4
line 5

produces (approximately):

xx line __hash__1#line 2#line 3
xx line 4#line 5

I'd like to use perl -pf opl.pl fileyy.txt yy with fileyy.txt:

yy line #1
line 2
line 3
yy line 4
line 5

to produce (approximately):

yy line __hash__1#line 2#line 3
yy line 4#line 5

Unfortunately, Perl treats the command-line argument yy as a file name rather than as a parameter for the script.

Wes
  • BTW: I'm aware of and willing to live with the facts that: 1) The script makes an initial blank line and omits a \n from the final line. 2) The script unnecessarily checks for the definition of the $recmark variable each line. – Wes Jan 08 '19 at 05:35
  • Generally speaking, this is answered by [How can I process options using Perl in -n or -p mode?](https://stackoverflow.com/q/53524699/589924). But in this specific case, the better solution is to stop using `-p` as the answers say. – ikegami Jan 08 '19 at 07:07
  • @ikegami, already using the script in various bash scripts and have other users who use the script. Dropping the -p option is not an option for me. – Wes Jan 08 '19 at 11:37
  • You really should NEVER pass `-p` to a script. It makes absolutely no sense for the caller to provide part of the program. And of course, you can get into this very problem. – ikegami Jan 08 '19 at 12:23
  • @ikegami, Next time I'll be prepared for last time. – Wes Jan 08 '19 at 19:01
  • @Wes With this question you _are_ going to be making changes to how the script is used, no? Then you can include the change whereby `-p` is no longer needed, and in the script add `while (<>) { }` and process files inside, as I show in my answer for example. (Not to mention that you should introduce proper, named arguments; it will make it much better.) – zdim Jan 08 '19 at 19:37
  • If you wanted to ignore `-p` if it's provided, you can use something like `BEGIN { if (!@ARGV || $ARGV[0] ne 'deflagged') { exec $^X, '--', $0, 'deflagged', @ARGV or die $!; } shift @ARGV; }`. – ikegami Jan 09 '19 at 02:26

3 Answers

The -n command switch

causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed -n or awk:

LINE:
while (<>) {
    ...       # your program goes here
}

where the <> filehandle is special, as

Input from <> comes either from standard input, or from each file listed on the command line.

In other words, it reads lines from all files given on the command line. The -p switch does the same, except that it also prints $_ at the end of every iteration.
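
To make the role of @ARGV concrete, here is a minimal sketch showing that <> simply walks through whatever is in @ARGV and opens each entry as a file. The temporary files and the @seen array are made up for the demonstration; they are not part of the script in the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two small temporary files to stand in for command-line file arguments.
my ($fh1, $file1) = tempfile(UNLINK => 1);
print {$fh1} "a1\na2\n";
close $fh1;
my ($fh2, $file2) = tempfile(UNLINK => 1);
print {$fh2} "b1\n";
close $fh2;

# Pretend the script was invoked as: perl script.pl $file1 $file2
my @seen;
{
    local @ARGV = ($file1, $file2);
    while (my $line = <>) {      # <> opens each entry of @ARGV in turn
        chomp $line;
        push @seen, $line;
    }
}
print join(",", @seen), "\n";
```

Anything else left in @ARGV, such as a stray yy, would be opened the same way, which is the failure described in the question.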

Those filenames are taken from the @ARGV array, which in your example contains fileyy.txt and yy; both are therefore treated as filenames.

One solution: remove the needed parameters (yy here) from @ARGV, in a BEGIN block. Then the operation of <> will indeed have only filenames to work with.

This brings up the question of your program's desired interface. If you want that parameter to be supplied last on the command line, use

my $param;
BEGIN {
    $param = pop @ARGV;
}

since pop removes from the back of an array; if you want the parameter to be given first then use shift. Note that your $recmark would also have to be removed from @ARGV.
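
If the script has to keep running under -p, where the value is stored may matter, because the whole script body becomes the body of the implied loop. The following is a rough sketch under stated assumptions: the for loop stands in for the loop that -p adds, the command line is faked in a BEGIN block, and the names $lexical, $global, @my_seen and @our_seen are invented for the demonstration.

```perl
use strict;
use warnings;

# Assumption for the demo: pretend the command line was "fileyy.txt yy".
BEGIN { @ARGV = ('fileyy.txt', 'yy') }

my (@my_seen, @our_seen);
for my $line ('line 1', 'line 2') {   # stands in for the loop that -p wraps around the script
    my  $lexical;                     # lexical declared inside the loop body
    our $global;                      # package variable
    BEGIN { $lexical = $global = pop @ARGV }   # runs once, at compile time
    push @my_seen,  $lexical // 'undef';
    push @our_seen, $global  // 'undef';
}
print "remaining \@ARGV: @ARGV\n";
print "my:  @my_seen\n";
print "our: @our_seen\n";
```

When run, the lexical is expected to hold the value only on the first pass, since it is cleared when each iteration's scope ends, while the package variable keeps it throughout. That suggests a package variable (as in the question's $recmark) is the safer place for the parameter in a -p script.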

Keeping track of all this is error prone, and inconvenient both for use and further work.

It would be far better to process those arguments using a good module, like Getopt::Long. Then you can give them names, easily change the interface as the need arises, and have each invocation properly checked by the module.

Also note that with filenames in @ARGV, which is what remains after you (or Getopt::Long) are done with options, you can process all lines from all files inside of

while (<>) { ... }

using the same <> mentioned above. Inside a script this is far better than -p.
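
As a rough illustration of that approach, here is a sketch of the question's script rewritten with its own loop instead of -p. The sub name join_records and the "marker is the last argument" interface are assumptions made for the example. Unlike the original, it anchors the marker at the start of the line and treats it as a literal string, and it does not emit the leading blank line.

```perl
use strict;
use warnings;

# Join every line that does not start with $recmark onto the preceding
# record line, using '#' as the separator; existing '#' become '__hash__'.
sub join_records {
    my ($recmark, @lines) = @_;
    my @records;
    for my $line (@lines) {
        chomp(my $text = $line);
        $text =~ s/#/__hash__/g;
        if ($text =~ /^\Q$recmark\E/ || !@records) {
            push @records, $text;        # start a new record
        } else {
            $records[-1] .= "#$text";    # continuation of the current record
        }
    }
    return @records;
}

# Hypothetical interface: perl opl.pl file1 [file2 ...] MARKER
# The guard keeps the sketch from waiting on STDIN when run with no arguments.
if (@ARGV) {
    my $recmark = pop @ARGV;             # what remains in @ARGV are filenames
    my @lines;
    while (my $line = <>) {              # the same <> that -p would have used
        push @lines, $line;
    }
    print "$_\n" for join_records($recmark, @lines);
}
```

Called as perl opl.pl fileyy.txt yy, this should join the yy records without yy ever being opened as a file.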

zdim
  • Heh. Posted right before I was about to finish my answer. :) –  Jan 08 '19 at 06:10
  • @Alhadis Yeah, around 6 minutes. It's happened to me as well (not once), and likely to everyone – zdim Jan 08 '19 at 07:20

From the perlrun(1) man page:

-p
causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed:

 LINE:
   while (<>) {
       ...             # your program goes here
   } continue {
       print or die "-p destination: $!\n";
   }

The most appropriate use of the -p switch is for one-liners, where each file argument is processed in turn, line by line, with the result of the program's execution printed to stdout.

Perl's angle brackets, which are implicitly added by the -p switch, take a filehandle as input and iterate through each line until EOF is reached:

while(<$opened_file_handle>) {
    …
}

HOWEVER, if no filehandle is passed, the angle-brackets will default to @ARGV, treating each available argument as a filename. If @ARGV is empty, <> falls back to standard input (equivalent to using <STDIN>).

If you want to pass both arguments and filenames on command-line, you have two choices:

  1. Order the arguments so the non-file-related args come first, like this:

    perl -f opt.pl ABC XYZ file1.txt file2.txt
    

And in your script:

my $first = shift;  # Modifies @ARGV in-place, placing "ABC" in $first
my $second = shift; # Same again, this time plucking "XYZ" from @ARGV and putting it in `$second`
  2. Or, use the Getopt::Long module to pass the non-filename arguments as switches (or "options"):

    perl -f opt.pl --foo ABC --bar XYZ  file1.txt file2.txt …
    

And the Perl code to do that:

use Getopt::Long;
my $foo = "";
my $bar = "";
GetOptions("foo=s" => \$foo, "bar=s" => \$bar);

Using Getopt::Long is the cleaner (and recommended) way to pass arguments while processing a list of files.
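
To show what is left over for <> after option parsing, here is a small sketch applied to the question's record marker. The option name --recmark and the helper parse_args are hypothetical; GetOptionsFromArray is the Getopt::Long variant that parses a given array instead of @ARGV, which keeps the example self-contained.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option: --recmark MARK, defaulting to "xx" as in the question.
sub parse_args {
    my @args = @_;
    my $recmark = 'xx';
    GetOptionsFromArray(\@args, 'recmark=s' => \$recmark)
        or die "Usage: $0 [--recmark MARK] file ...\n";
    return ($recmark, @args);    # whatever is left over are the filenames
}

my ($mark, @files) = parse_args('--recmark', 'yy', 'file1.txt', 'file2.txt');
print "marker=$mark files=@files\n";
```

In a real script the same call would be GetOptions(...) on @ARGV itself. If the script still has to run under -p, the parsing would presumably need to sit in a BEGIN block so that the options are removed before the implied loop starts opening files.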

Hope this helps!

Wes

Consider using an environment variable as an alternative to mucking about with your command-line arguments.

recmark=yy perl -pf opl.pl file1 file2 ...

BEGIN { $recmark = $ENV{recmark} // "xx" };
...
mob
  • Would do this except the script has to run in Windows (Strawberry perl) as well. – Wes Jan 08 '19 at 11:29