
I have the following Perl code, where I try to run all the HTML files against another script using the qx() operator.

#!/usr/bin/perl
use File::Basename;
use File::Spec::Functions qw(rel2abs);
use Getopt::Long;
use Env;

@all_files=glob("./report2/*html");
print "\nTotal files = $#all_files";

    
open(CONSOLIDATED,">/tmp/qtr_20230628.csv");
for my $file (@filter2)
{
    $qtr1=qx(./parse_qr_results.ksh "$file");
    #print "[$file]\n";
    print CONSOLIDATED "$qtr1";
}

I have two files named M&M.html and M&MFIN.html, and these are having issues inside the qx(). How do I solve this?

Can I use the inode inside glob() and pass that as a parameter to the qx()?

EDIT-2:

As requested, updating the question with parse_qr_results.ksh.

Inside parse_qr_results.ksh, I'm copying the content of the file using qx(cat '$symbol_file'):

$symbol_file=$ARGV[0];

$file_content=qx(cat '$symbol_file');
stack0114106
  • `use strict; use warnings;`, what are the undeclared variables? When I just use `print $file, "\n", qx(cat "$file"), "\n";` in the loop, everything works as expected (I get the contents of a file with an & in the name). Can you make a [mcve] please? – Robert Jun 30 '23 at 00:07
  • @Robert "_everything works as expected_" -- no, it doesn't. They need those filenames processed by a shell (script), not just `cat`-ed, and `&` has a special meaning. Need be escaped (see my answer) – zdim Jun 30 '23 at 00:09
  • @zdim Well, *I* expected to see the files contents, and that was the result, so I guess I should have said "works as I expected". No idea what OP expected. – Robert Jun 30 '23 at 00:11
  • @Robert "_No idea what OP expected_" -- as their question shows they expect the files with such names to be processed by a (K-) shell script. But you do get points for good humor :) – zdim Jun 30 '23 at 00:12
  • @zdim Thanks! :-) I use bash, maybe that makes a difference. – Robert Jun 30 '23 at 00:13
  • @Robert In other words, their code _is_ a minimal example (well, with some bloopers -- like it's not been shown how `@filter2` comes out of `@files`) – zdim Jun 30 '23 at 00:14
  • @Robert Try this in bash: `perl -wE'$s = q(ahM&M.txt); say $s; say for qx(ls -l $s)'`. There's no such file but that `&` clearly throws a wrench in the command line – zdim Jun 30 '23 at 00:16
  • @Robert On the other hand, with `"$s"` in there (quoted) it works as expected (`ls: cannot access 'ahM&M.txt': No such file or directory`) since that `&` is protected ... but if that is taken by a shell script, as they need it, then there's no telling how bad it could be. So I'd explicitly escape – zdim Jun 30 '23 at 00:24
  • @stack0114106, While the quoting you used isn't as robust as using `shell_quote` (or avoiding the shell entirely), it should have been sufficient to handle the files in question. If everything your post says is accurate, that points to a problem with `parse_qr_results.ksh` – ikegami Jun 30 '23 at 13:40
  • "_...a problem with `parse_qr_results.ksh`..._" -- I second that -- can you show relevant parts of that script? – zdim Jun 30 '23 at 17:20
  • @zdim updated the question to include the parse_qr_results.ksh – stack0114106 Jun 30 '23 at 19:09
  • @zdim.. for a file like M&MFIN.html, the parse_qr threw error .. The & sends the script to background leaving MFIN.html as one more executable call and fails again saying no such script.. the quotemeta didn't help here, but I used it for other string transformation. – stack0114106 Jun 30 '23 at 20:03
  • @stack0114106 "_The & sends the script to background leaving MFIN.html as one more executable call_" -- OK, so `&` is unhandled. (Strange, there's lots of quoting?) Btw, I've updated my post -- for one thing, to use the better-suited `String::ShellQuote::shell_quote` (rather than `quotemeta`). It could well be just extra quotes; I'll look into it later today – zdim Jun 30 '23 at 20:19
  • @stack0114106 As I looked nicely now, I am confused -- is that `parse_qr_results.ksh` a _Perl_ script? By the extension I assumed that it was a Korn shell script (and at a glance I thought I was missing things since I don't know much of that shell) ... but, no, Korn does not have such syntax at all, while it is valid Perl. Can you clarify? – zdim Jul 01 '23 at 07:38
  • yes, it is a perl script but I used ".ksh" extension..I think that confused you.. – stack0114106 Jul 01 '23 at 13:58
  • "_yes, it is a perl script..._" -- oh. Then just drop those single quotes around the filename. The escaping/quoting from the calling program (like shown in my answer for example) protects the `&` (and other shell-special characters) and then you just `cat` the file out of `qx`. But if this is really about opening a file then much better open it using Perl's tools -- either native or slurp-ing modules, if you indeed need a file contents in a scalar.) I've seen `File::Slurper` used, which seems to be good, and I've used `Path::Tiny::slurp` – zdim Jul 03 '23 at 17:43
  • (And if the shell is entirely avoided you shouldn't need to protect-quote/escape the filename either.) – zdim Jul 03 '23 at 17:56
  • I updated the answer and I'd leave it as it is. Out of curiosity, why name a Perl script with a `.ksh` extension? It can only throw people off? – zdim Jul 03 '23 at 17:59

2 Answers


Update   It is clarified that the program that uses the file, parse_qr_results.ksh, is in fact a Perl script. The code in it that uses the file has single quotes around the filename, which it shouldn't have. Further, there are better ways to do what's shown in the question's edit -- to avoid a shell altogether, or really to use Perl's File::Slurper or Path::Tiny::slurp if the need is indeed to "slurp" a file. See for example this post (and pay attention to updates). For other aspects of this question, also see this post.
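For instance, a minimal sketch of that script's file-reading part using File::Slurper (variable names follow the question's edit):

use strict;
use warnings;
use File::Slurper qw(read_text);

# minimal sketch: read the whole file without going through a shell,
# so the & in names like M&M.html needs no escaping here
my $symbol_file  = $ARGV[0];
my $file_content = read_text($symbol_file);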

But I'm leaving the answer as it is -- for an assumption that .ksh is a (Korn) shell file.


That glob will scoop up the filenames fine (unless some have spaces in them, see File::Glob below), but the & messes up the shell later.

Escape/quote shell special characters, and String::ShellQuote is meant for exactly that:

use String::ShellQuote qw(shell_quote);

for my $file (@filter2) {
    my $file_esc = shell_quote $file;

    # shell_quote already wraps the name in quotes as needed, so don't add another layer
    my $qtr1 = qx(./parse_qr_results.ksh $file_esc);

    ...
}

Note that String::ShellQuote quotes for Bourne-compatible shells like bash, while the program in the question that uses the filename appears to be a Korn shell script (parse_qr_results.ksh). However, the shells' special characters are mostly the same and this should be good enough. A more generic tool for escaping special characters is quotemeta.

The quotes in the question protect that filename at first but then it's taken by the shell script and we don't know what goes on there, so I'd specifically quote/escape with a library and hope for the best. (I'd really rather first rename files with such names...)

Also, with such funky filenames it may be a good idea to switch to File::Glob

use File::Glob qw(:bsd_glob);

Then glob gets replaced by bsd_glob and there is no need to change the code. But there is more you can do with it, see docs.
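
For instance, a minimal sketch (the directory name with a space is made up, only to show the difference from the default csh-style glob, which splits patterns on whitespace):

use File::Glob qw(:bsd_glob);   # core glob now behaves as bsd_glob

my @all_files = glob "./report2/*html";          # same call as before
my @spaced    = glob "./monthly reports/*html";  # one pattern, not split on the space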

I strongly recommend use strict; and use warnings;, for code quality, time savings, sanity, and general health :)


In the end, a better way altogether is to use a library for running external commands, which brings a lot of improvements to the whole process. For one, such libraries can easily bypass the shell and still return all the output. The error diagnostics are usually better, too.

Some libraries are IPC::Run (use run with a list, so as to bypass the shell), IPC::System::Simple (see capturex), and Capture::Tiny (use system in list form).

Since your command itself is a shell script I'd still protect the filenames, but avoiding the first (qx's) shell in this way is helpful regardless.
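
For example, a minimal sketch of the capturex route, reusing the question's loop (CONSOLIDATED and @filter2 are as in the question):

use IPC::System::Simple qw(capturex);

for my $file (@filter2) {
    # no shell is involved here, so & in the filename can't break this command;
    # capturex also dies with a useful message if the program fails
    my $qtr1 = capturex('./parse_qr_results.ksh', $file);
    print CONSOLIDATED $qtr1;
}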


And if there is somehow a problem with installing extra libraries, one simple way to read a file into a scalar is

my $file_content = do { local (@ARGV, $/) = $filename; <> };

if we have the name of the file to slurp. Or

my $file_content = do { local $/; <> };

for a filename given on the command line as the first element of @ARGV (normally when there are no command-line options, or after they have all been processed and removed from @ARGV).

Or avoid the magic of <> entirely and use explicit open my $fh, ... to open a file and then read it with <$fh>, after undefining the local-ized input record separator $/ as above.
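
A minimal sketch of that explicit-open version, with $filename holding the name of the file as above:

my $file_content = do {
    open my $fh, '<', $filename or die "Can't open '$filename': $!";
    local $/;    # undefine the input record separator so <$fh> reads the whole file
    <$fh>;
};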

Note the closing ; which is necessary.

zdim
  • `quotemeta` is not appropriate for shell quoting. You want `shell_quote` from String::ShellQuote. Better yet, avoid the shell by using `capturex` from IPC::System::Simple. – ikegami Jun 30 '23 at 06:20
  • @ikegami Indeed, corrected, thank you. A problem is that the command they want to run is itself a shell script, and we don't know what it does with those filenames ... (I may add more tomorrow) – zdim Jun 30 '23 at 09:34
  • Re "*we don't know what it does with those filenames*", There could indeed be problems in `parse_qr_results.ksh`. And in fact, I suspect that is the case. While the quoting the OP used isn't as robust as using `shell_quote`, it should have been sufficient to handle the files in question. If everything the OP said is accurate, that points to a problem with `parse_qr_results.ksh` – ikegami Jun 30 '23 at 13:38

You can pass the command and its arguments as a list to a piped open (the multi-argument form), which avoids invoking a shell entirely.

See: https://perldoc.perl.org/perlopentut

#!/usr/bin/perl

# Note: You should always use strict/warnings as zdim mentioned
use strict;
use warnings;

# Note: You should always check return value from open/close filehandles.
# `autodie` will automatically do that for you.
use autodie;

use File::Basename;
use File::Spec::Functions qw(rel2abs);
use Getopt::Long;
use Env;

# Note: `$#arr` will return the last index of array, not array size
my @all_files = glob './report2/*html';
print "\nTotal files = " . scalar @all_files;


# Note: Use 3-arguments open
open my $fh_consolidated, '>', "/tmp/qtr_20230628.csv";
# Note: the question never shows how @filter2 is built; it is assumed here
# to be the globbed files (adjust the filtering as needed)
my @filter2 = @all_files;

for my $file (@filter2) {

    # Note: Pass shell arguments as an array. Capture output using '-|'.
    open my $fh_out, '-|', './parse_qr_results.ksh', $file;
    while (<$fh_out>) {
        print {$fh_consolidated} $_;
    }

    # Note: Don't forget to close filehandles.
    close $fh_out;
}

close $fh_consolidated;
ernix