tl;dr
The following subroutine safely quotes (escapes) a list of filenames (paths) on both Unix-like and Windows systems:
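#!/usr/bin/env perl
# Note: the /r (non-destructive substitution) flag requires Perl v5.14+.
sub quoteforshell {
  return join ' ', map {
    $^O eq 'MSWin32' ?
      '"' . s/"/""/gr . '"'       # cmd.exe: enclose in "...", escape " as ""
    :
      "'" . s/'/'\\''/gr . "'"    # POSIX shells: enclose in '...', escape ' as '\''
  } @_;
}
# Sample invocation
my $shellcmd = ($^O eq 'MSWin32' ? 'echo ' : 'printf "%s\n" ') .
  quoteforshell('\\foo/bar', 'I\'m here', '3" of snow', 'bar |&;()<>#!');
print `$shellcmd`;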
Output of the sample command on Unix-like systems, showing that all input arguments were passed through unmodified:
\foo/bar
I'm here
3" of snow
bar |&;()<>#!
On Unix-like systems, it should work with any strings (except ones with embedded NUL chars), not just filenames - see below for details.
On Windows, embedded " instances are escaped as "", which is the only safe way to do it, but, sadly, may not be what the target program expects - see below for details. Note, however, that this is not a concern if you're only passing filenames on Windows, because " is not a legal filename character.
See the bottom of this post for a shell-less command-invocation alternative that bypasses the "-quoting problem on Windows.
On Unix-like platforms, qx// (the generalized form of `...`) and the single-argument forms of system and exec invoke the shell, when they involve one, by passing the command to /bin/sh -c. /bin/sh is assumed to be POSIX-compatible (and may or may not be Bash on a given system).
The single-argument forms of system and exec may or may not involve a shell: Perl decides based on the specific command passed whether a shell is needed. For instance, if a command contains shell metacharacters such as embedded (literal) single- or double-quotes, the shell is called; otherwise, Perl splits the command into words and executes it directly. Since the solution below is based on embedding single-quoted tokens in the command string, the shell is always involved, so the quoting also works with the single-argument forms of system and exec.
In POSIX-compatible shells you can take advantage of single-quoted strings, which do not interpolate their contents in any way.
The only challenge is to escape single-quotes (') themselves, which requires trickery because, strictly speaking, the shell does not support embedding single-quotes in a single-quoted string.
The trick is to replace every ' instance with '\'' (sic), which works around the problem by effectively splitting the input string into multiple single-quoted strings, with escaped ' instances (\') spliced in between; the shell then reassembles the parts into a single string.
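To make the splitting concrete, here is a minimal sketch that quotes a single string and sends it through the shell once. The helper name sq_quote is hypothetical (it is not part of the code in this post), and the /r flag assumes Perl v5.14 or later.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: single-quote ONE string for /bin/sh.
sub sq_quote {
    my ($str) = @_;
    # Each ' becomes '\'': close the current single-quoted string, add an
    # escaped quote, then open a new single-quoted string.
    return "'" . ($str =~ s/'/'\\''/gr) . "'";
}

my $quoted = sq_quote("I'm here");
print "$quoted\n";    # prints: 'I'\''m here'

# Round trip through the shell (Unix-like systems only).
if ($^O ne 'MSWin32') {
    my $roundtrip = `printf '%s' $quoted`;
    print "$roundtrip\n";    # prints: I'm here
}
```

The shell sees three adjacent pieces, 'I', \' and 'm here', and concatenates them into one argument.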
Here's a subroutine that takes a list of strings (filenames) and returns a space-separated string of quoted versions of those strings, which guarantees that the shell uses them literally:
sub quoteforsh { join ' ', map { "'" . s/'/'\\''/gr . "'" } @_ }
Example (uses most POSIX shell metacharacters):
my $shellcmd = 'printf "%s\n" ' .
quoteforsh('\\foo/bar', 'I\'m here', '3" of snow', 'bar |&;()<>#!');
print `$shellcmd`;
This passes the following to /bin/sh -c (shown here as a pure literal, without any quoting):
printf "%s\n" '\foo/bar' 'I'\''m here' '3" of snow' 'bar |&;()<>#!'
Note how each input string is enclosed in single-quotes, and how the only character that needed quoting among all input strings was ', which, as discussed, was replaced with '\''.
This should output the input strings as-is, one on each line:
\foo/bar
I'm here
3" of snow
bar |&;()<>#!
On Windows, the analogous subroutine looks like this:
sub quoteforcmdexe { join ' ', map { '"' . s/"/""/gr . '"' } @_ }
This works analogously to quoteforsh() above, except that:
- double-quotes are used to enclose the tokens, because cmd.exe doesn't support single-quoting.
- the only character that needs escaping is ", which is escaped as "". Note, however, that for filenames this isn't strictly necessary, because Windows doesn't allow " instances in filenames.
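As a quick check of what the subroutine produces, the following sketch only builds and prints the command string, so it can be run on any platform (again assuming Perl v5.14+ for /r). The sample arguments are made up for illustration, and the subroutine is repeated so that the snippet is self-contained.

```perl
use strict;
use warnings;

# Same logic as quoteforcmdexe() above, repeated to keep the snippet self-contained.
sub quoteforcmdexe { join ' ', map { '"' . s/"/""/gr . '"' } @_ }

# Build (but do not run) a cmd.exe command line from two sample arguments.
my $cmd = 'echo ' . quoteforcmdexe('C:\\Program Files\\foo.txt', '3" of snow');
print "$cmd\n";    # prints: echo "C:\Program Files\foo.txt" "3"" of snow"
```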
However, there are limitations and pitfalls:
- You cannot suppress interpretation of references to existing environment variables, such as %USERNAME%; by contrast, references to non-existing variables and isolated % instances are fine.
  - Note: You should be able to escape % instances as %%, but while that works in a batch file, it doesn't work from Perl, apparently because cmd.exe honors %% as an escape only inside batch files, not in a command line passed to it directly: `perl "%%USERNAME%%.pl"` complains, e.g., about %jdoe%.pl not being found, implying that %USERNAME% was interpolated despite the doubled % chars.
  - (On the flip side, isolated % instances in double-quoted strings don't need escaping the way they do in batch files.)
- Escaping embedded " instances as "" is the only SAFE way to do it, but it is not what most target programs expect.
  - On Windows, incredibly, the required escaping is ultimately up to the target program - for full background, see https://stackoverflow.com/a/31413730/45375
  - In short, the quandary is:
    - If you escape for the target program - and most, including Perl, expect \" - then part of the argument list may never be passed to the target program, with the remaining part causing failure, unwanted redirection to a file, or, worse, unexpected execution of arbitrary commands.
    - If you escape for cmd.exe, you may break the target program's parsing.
    - You cannot escape for both.
  - You can work around the problem if your command doesn't need to involve the shell at all - see below.
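The quandary can be illustrated with a deliberately simplified model of how cmd.exe tracks quoting. The sketch below assumes that cmd.exe toggles its "inside quotes" state at every " and treats \ as an ordinary character; it ignores ^-escaping and everything else cmd.exe does. The helper unquoted_metachars and the sample argument are hypothetical.

```perl
use strict;
use warnings;

# Simplified model (an assumption, not cmd.exe's full parser): every " toggles
# the "inside quotes" state, and \ has no special meaning.
# Returns the shell metacharacters that end up OUTSIDE of quotes.
sub unquoted_metachars {
    my ($cmdline) = @_;
    my $inside = 0;
    my @found;
    for my $ch (split //, $cmdline) {
        if ($ch eq '"') { $inside = !$inside; next }
        push @found, $ch if !$inside && $ch =~ /[&|<>]/;
    }
    return @found;
}

my $arg = 'a" & calc';    # hypothetical argument with an embedded "

my $for_cmd    = '"' . ($arg =~ s/"/""/gr)  . '"';    # "a"" & calc"
my $for_target = '"' . ($arg =~ s/"/\\"/gr) . '"';    # "a\" & calc"

for my $token ($for_cmd, $for_target) {
    my @exposed = unquoted_metachars($token);
    printf "%-14s exposed metacharacters: %s\n", $token,
        @exposed ? join(' ', @exposed) : '(none)';
}
```

In this model, the ""-escaped token keeps the & inside quotes, whereas the \"-escaped token ends the quoted string early and exposes the & to cmd.exe, which is what makes that form unsafe.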
Alternative: shell-less command invocation
If your command is an invocation of a single executable with all arguments to be passed as-is, there's no need to involve the shell at all, which:
- doesn't require quoting of the arguments, which notably bypasses the "-quoting problem on Windows
- is generally more efficient
The following subroutine works on both Unix-like systems and Windows, and is a shell-less alternative to qx// (`...`), which accepts the command to invoke as a list of arguments to interpret as-is:
sub qxnoshell {
  use IPC::Cmd;
  return unless @_;
  my @cmdargs = @_;
  if ($^O eq 'MSWin32') { # Windows
    # Ensure that the executable name ends in '.exe'
    $cmdargs[0] .= '.exe' unless $cmdargs[0] =~ m/\.exe$/i;
    unless (IPC::Cmd::can_run($cmdargs[0])) { # executable not found
      # Issue warning, as qx// would and open '-|' below does.
      my $warnmsg = "Executable '$cmdargs[0]' not found";
      scalar(caller) eq 'main' ? warn($warnmsg . "\n") : warnings::warnif('exec', $warnmsg);
      return;
    }
    for (@cmdargs[1..$#cmdargs]) {
      if (m'"') {
        s/"/\\"/g; # \-escape ALL embedded double-quotes (note the /g)
        $_ = '"' . $_ . '"'; # enclose as a whole in embedded double-quotes
      }
    }
  }
  # List form of open: the command is invoked directly, without a shell.
  open my $fh, '-|', @cmdargs or return;
  my @lines = <$fh>;
  close $fh;
  return wantarray ? @lines : join('', @lines);
}
Examples
# Unix: $out should receive literal '$$', which demonstrates that
# /bin/sh is not involved.
my $out = qxnoshell 'printf', '%s', '$$';
# Windows: $out should receive literal '%USERNAME%', which demonstrates
# that cmd.exe is not involved.
my $out = qxnoshell 'perl', '-e', 'print "%USERNAME%"';
- Requires Perl v5.9.5+ due to use of IPC::Cmd.
- Note that the subroutine works hard to make things work on Windows:
  - Even though the arguments are passed as a list, open ..., '-|' on Windows still falls back on cmd.exe if the initial invocation attempt fails - the same applies to system() and exec(), incidentally.
  - Thus, in order to prevent this fallback to cmd.exe - which can have unintended consequences - the subroutine (a) ensures that the first list argument is an *.exe executable, (b) tries to locate it, and (c) only tries to invoke the command if the executable could be located.
  - On Windows, sadly, any argument that contains embedded double-quotes is not passed through correctly to the target program - it needs escaping by (a) enclosing that argument in embedded double-quotes, and (b) escaping the original embedded double-quotes as \".
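For comparison, on Unix-like systems alone, where neither the cmd.exe fallback nor the double-quote escaping applies, the core of the approach reduces to the list form of open. The following is a minimal sketch under that assumption; the name capture_noshell is hypothetical, and it relies on an external printf utility being in the PATH.

```perl
use strict;
use warnings;

# Hypothetical, Unix-only reduction of qxnoshell(): run a command given as a
# list and return its entire output as one string, without involving a shell.
sub capture_noshell {
    my @cmd = @_;
    # With a list after the '-|' mode, the program is executed directly and
    # the arguments are passed through as-is.
    open(my $fh, '-|', @cmd) or return;
    my $out = do { local $/; <$fh> };    # slurp all output
    close $fh;
    return $out;
}

# None of these shell metacharacters are interpreted.
my $out = capture_noshell('printf', '%s', '$$ `date` *');
print "$out\n";    # prints: $$ `date` *
```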