11

I'd like to print a string literal in AWK / gawk using the PowerShell command line (the specific program is unimportant). However, I think I misunderstand the quoting rules somewhere along the line -- PowerShell apparently removes double quotes inside single quotes when passing arguments to native commands, but not when passing them to cmdlets.

This works in Bash:

bash$ awk 'BEGIN {print "hello"}'
hello    <-- GOOD

And this works in PowerShell -- but importantly I have no idea why the escaping is needed:

PS> awk 'BEGIN {print \"hello\"}'
hello    <-- GOOD

This prints nothing in PowerShell:

PS> awk 'BEGIN {print "hello"}'
    <-- NOTHING IS BAD

If this really is the only way of doing this in PowerShell, then I'd like to understand the chain of quoting rules that explains why. According to the PowerShell quoting rules at About Quoting Rules, this shouldn't be necessary.

BEGIN SOLUTION

The punchline, courtesy of Duncan below, is that you should add this function to your PowerShell profile:

 filter Run-Native($command) { $_ | & $command ($args -replace'(\\*)"','$1$1\"') }

Or specifically for AWK:

 filter awk { $_ | gawk.exe ($args -replace'(\\*)"','$1$1\"') }

END SOLUTION

The quotes are properly passed to PowerShell's echo:

PS> echo '"hello"'
"hello"    <-- GOOD

But when calling out to an external "native" program, the quotes disappear:

PS> c:\cygwin\bin\echo.exe '"hello"'
hello    <-- BAD, POWERSHELL REMOVED THE QUOTES

Here's an even cleaner example, in case you're concerned that Cygwin might have something to do with this:

echo @"
>>> // program guaranteed not to interfere with command line parsing
>>> public class Program
>>> {
>>>    public static void Main(string[] args)
>>>    {
>>>       System.Console.WriteLine(args[0]);
>>>    }
>>> }
>>> "@ > Program.cs
csc.exe Program.cs
.\Program.exe '"hello"'
hello    <-- BAD, POWERSHELL REMOVED THE QUOTES

DEPRECATED EXAMPLE for passing to cmd, which does its own parsing (see Etan's comment below):

PS> cmd /c 'echo "hello"'
"hello"     <-- GOOD

DEPRECATED EXAMPLE for passing to Bash, which does its own parsing (see Etan's comment below):

PS> bash -c 'echo "hello"'
hello    <-- BAD, WHERE DID THE QUOTES GO

Any solutions, more elegant workarounds, or explanations?

Peter Mortensen
jhclark
  • `cmd /c` has complicated and bizarre quoting rules, I'm not sure that's a good test case. The bash test case is explained because the shell you are running processes the command normally (and thus performs variable expansion, quote removal, etc.). – Etan Reisner Jan 23 '14 at 17:49
  • Compare the output from `bash -xc 'echo "hello"'` to the output from `bash -c 'echo \"hello\"'` to see what I mean. – Etan Reisner Jan 23 '14 at 17:55
  • As a further data point try getting the inner quotes through `powershell -command `. I have not been able to do that at all. – Etan Reisner Jan 23 '14 at 18:00
  • Good call Etan, I've updated the examples above with cleaner test cases -- This helps place the blame squarely on PowerShell. – jhclark Jan 23 '14 at 18:02
  • Possible duplicate of [PowerShell stripping double quotes from command line arguments](http://stackoverflow.com/questions/6714165/powershell-stripping-double-quotes-from-command-line-arguments) – phuclv May 19 '17 at 06:00

3 Answers

10

The problem here is that the Windows standard C runtime strips unescaped double quotes out of arguments when parsing the command line. PowerShell passes arguments to native commands by putting double quotes around the arguments, but it doesn't escape any double quotes that are contained in the arguments.

Here's a test program that prints out the arguments it was given using the C stdlib, the 'raw' command line from Windows, and the Windows command line processing (which seems to behave identically to the stdlib):

C:\Temp> type t.c
#include <stdio.h>
#include <windows.h>
#include <ShellAPI.h>

int main(int argc,char **argv){
    int i;
    for(i=0; i < argc; i++) {
        printf("Arg[%d]: %s\n", i, argv[i]);
    }

    LPWSTR *szArglist;
    LPWSTR cmdLine = GetCommandLineW();
    wprintf(L"Command Line: %s\n", cmdLine);
    int nArgs;

    szArglist = CommandLineToArgvW(GetCommandLineW(), &nArgs);
    if( NULL == szArglist )
    {
        wprintf(L"CommandLineToArgvW failed\n");
        return 0;
    }
    else for( i=0; i<nArgs; i++) printf("%d: %ws\n", i, szArglist[i]);

// Free memory allocated for CommandLineToArgvW arguments.

    LocalFree(szArglist);

    return 0;
}

C:\Temp>cl t.c "C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x86\shell32.lib"
Microsoft (R) C/C++ Optimizing Compiler Version 18.00.21005.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

t.c
Microsoft (R) Incremental Linker Version 12.00.21005.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:t.exe
t.obj
"C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x86\shell32.lib"

Running this in cmd we can see that all unescaped quotes are stripped, and spaces only separate arguments when there have been an even number of unescaped quotes:

C:\Temp>t "a"b" "\"escaped\""
Arg[0]: t
Arg[1]: ab "escaped"
Command Line: t  "a"b" "\"escaped\""
0: t
1: ab "escaped"
C:\Temp>t "a"b c"d e"
Arg[0]: t
Arg[1]: ab
Arg[2]: cd e
Command Line: t  "a"b c"d e"
0: t
1: ab
2: cd e

PowerShell behaves a bit differently:

C:\Temp>powershell
Windows PowerShell
Copyright (C) 2012 Microsoft Corporation. All rights reserved.

C:\Temp> .\t 'a"b'
Arg[0]: C:\Temp\t.exe
Arg[1]: ab
Command Line: "C:\Temp\t.exe"  a"b
0: C:\Temp\t.exe
1: ab
C:\Temp> $a = "string with `"double quotes`""
C:\Temp> $a
string with "double quotes"
C:\Temp> .\t $a nospaces
Arg[0]: C:\Temp\t.exe
Arg[1]: string with double
Arg[2]: quotes
Arg[3]: nospaces
Command Line: "C:\Temp\t.exe"  "string with "double quotes"" nospaces
0: C:\Temp\t.exe
1: string with double
2: quotes
3: nospaces

In PowerShell, any argument that contains spaces is enclosed in double quotes. The command itself also gets quotes, even when it contains no spaces. Other arguments aren't quoted, even if they include punctuation such as double quotes, and (I think this is a bug) PowerShell doesn't escape any double quotes that appear inside the arguments.
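
The behaviour seen in the transcripts above can be modelled in a few lines. This Python sketch is only an approximation inferred from that output, not PowerShell's actual implementation: `build_command_line` is a hypothetical name, and it assumes that a space in the argument is the only thing that triggers quoting.

```python
def build_command_line(args):
    """Join arguments the way the transcripts above suggest PowerShell does.

    Assumption (inferred from the observed output only): an argument is
    wrapped in double quotes when it contains a space, and embedded double
    quotes are copied through without any escaping.
    """
    parts = []
    for arg in args:
        if " " in arg:
            parts.append('"' + arg + '"')  # wrapped, inner quotes untouched
        else:
            parts.append(arg)              # passed as-is, even if it contains "
    return " ".join(parts)


print(build_command_line(['a"b']))
print(build_command_line(['string with "double quotes"', 'nospaces']))
```

Under these assumptions the two calls reproduce the argument portion of the "Command Line:" lines shown above, which is why the receiving program sees the inner quotes as unescaped.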

In case you're wondering (I was), PowerShell doesn't even bother to quote arguments that contain newlines, but neither does the argument processing consider newlines as whitespace:

C:\Temp> $a = @"
>> a
>> b
>> "@
>>
C:\Temp> .\t $a
Arg[0]: C:\Temp\t.exe
Arg[1]: a
b
Command Line: "C:\Temp\t.exe"  a
b
0: C:\Temp\t.exe
1: a
b

Since PowerShell doesn't escape the quotes for you, the only option seems to be to do it yourself:

C:\Temp> .\t 'BEGIN {print "hello"}'.replace('"','\"')
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Command Line: "C:\Temp\t.exe"  "BEGIN {print \"hello\"}"
0: C:\Temp\t.exe
1: BEGIN {print "hello"}

To avoid doing that every time, you can define a simple function:

C:\Temp> function run-native($command) { & $command $args.replace('\','\\').replace('"','\"') }

C:\Temp> run-native .\t 'BEGIN {print "hello"}' 'And "another"'
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Arg[2]: And "another"
Command Line: "C:\Temp\t.exe"  "BEGIN {print \"hello\"}" "And \"another\""
0: C:\Temp\t.exe
1: BEGIN {print "hello"}
2: And "another"

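N.B. You have to escape backslashes as well as double quotes, otherwise arguments that already contain a backslash followed by a quote are not passed through intact. (Correction: doubling every backslash is not right in general either; see the further edit below.)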

C:\Temp> run-native .\t 'BEGIN {print "hello"}' 'And \"another\"'
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Arg[2]: And \"another\"
Command Line: "C:\Temp\t.exe"  "BEGIN {print \"hello\"}" "And \\\"another\\\""
0: C:\Temp\t.exe
1: BEGIN {print "hello"}
2: And \"another\"

Another edit: Backslash and quote handling in the Microsoft universe is even weirder than I realised. Eventually I had to go and read the C stdlib sources to find out how they interpret backslashes and quotes:

/* Rules: 2N backslashes + " ==> N backslashes and begin/end quote
          2N+1 backslashes + " ==> N backslashes + literal "
           N backslashes ==> N backslashes */
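
To make the three rules concrete, here is a simplified Python sketch of a parser that applies them. It is an illustration, not the C runtime's code: `split_command_line` is a hypothetical name, and the sketch deliberately ignores the special treatment of the program name and the doubled-quote (`""`) case that the real runtime also handles.

```python
def split_command_line(cmdline):
    """Split an argument string using the three backslash/quote rules above."""
    args = []
    current = []
    in_quotes = False
    have_arg = False  # True once the current argument has started
    i = 0
    n = len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Measure the run of consecutive backslashes.
            start = i
            while i < n and cmdline[i] == "\\":
                i += 1
            count = i - start
            if i < n and cmdline[i] == '"':
                current.append("\\" * (count // 2))
                if count % 2 == 1:
                    current.append('"')        # 2N+1 backslashes: literal quote
                else:
                    in_quotes = not in_quotes  # 2N backslashes: begin/end quote
                i += 1  # consume the quote character
            else:
                current.append("\\" * count)   # no quote follows: kept literally
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes          # unescaped quote is dropped
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:                       # whitespace ends the argument
                args.append("".join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


print(split_command_line('"a"b c"d e"'))                             # ['ab', 'cd e']
print(split_command_line('"string with "double quotes"" nospaces'))
print(split_command_line('"BEGIN {print "hello"}"'))                 # ['BEGIN {print hello}']
```

The second call yields three arguments, matching the Arg[1] to Arg[3] lines in the earlier PowerShell transcript, and the third shows why the awk program from the question ends up printing an unset variable instead of a string.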

So that means run-native should be:

function run-native($command) { & $command ($args -replace'(\\*)"','$1$1\"') }

and all backslashes and quotes will survive the command line processing. Or if you want to run a specific command:

filter awk() { $_ | awk.exe ($args -replace'(\\*)"','$1$1\"') }

(Updated following @jhclark's comment: it needs to be a filter to allow piping into stdin.)
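
For readers who want to check what the `-replace` pattern does to a given argument, the same substitution can be written in Python. This is a sketch for experimentation only; `escape_quotes` is a hypothetical name, and the pattern is simply a transliteration of the one used in the functions above.

```python
import re


def escape_quotes(arg):
    """Mirror of the -replace '(\\*)"','$1$1\"' substitution used above.

    Every run of backslashes that directly precedes a double quote is
    doubled, and one extra backslash is placed in front of the quote.
    Backslashes that are not followed by a quote are left untouched.
    """
    return re.sub(r'(\\*)"', r'\1\1\\"', arg)


for sample in ['BEGIN {print "hello"}', 'And \\"another\\"', 'C:\\dir\\file']:
    print(sample, '->', escape_quotes(sample))
```

The third sample comes back unchanged, which is consistent with the last of the three rules: backslashes that do not precede a quote need no escaping.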

Peter Mortensen
Duncan
  • This does explain a good bit of my confusion. Coming from the Linux world, I'm used to a kernel-level API call that actually knows about a proper argv array: http://man7.org/linux/man-pages/man3/exec.3.html -- But this isn't the case in Windows since the kernel API function for creating new processes accepts only a single command line string, forcing each language's basic libraries to handle this task: See http://msdn.microsoft.com/en-us/library/17w5ykft%28v=vs.85%29.aspx and http://msdn.microsoft.com/en-us/library/windows/desktop/ms682425(v=vs.85).aspx – jhclark Jan 24 '14 at 22:46
  • @jhclark I've updated my answer, it turns out that command line processing is trickier than just escaping quote and backslash. – Duncan Jan 28 '14 at 14:11
  • One quick update, as a special case for awk, which is a line-by-line text processing tool, I needed to pipe input into awk incrementally, so I prefer to use a filter: filter awk { $_ | gawk.exe ($args -replace'(\\*)"','$1$1\"') } – jhclark Jan 28 '14 at 17:35
  • hello @Duncan . I've devoured your post here however am still hitting similar problems. There's a simple repro here: http://stackoverflow.com/questions/34922095/plink-strips-out-double-quote-marks, I'm hoping you or someone reading this wouldn't mind taking a look. – jamiet Jan 21 '16 at 12:54
2

You get different behavior because you're using four different echo commands, and in different ways on top of that.

PS> echo '"hello"'
"hello"

echo is PowerShell's Write-Output cmdlet.

This works, because the cmdlet takes the given argument string (the text within the outer set of quotes, i.e. "hello") and prints that string to the success output stream.

PS> c:\cygwin\bin\echo '"hello"'
hello

echo is Cygwin's echo.exe.

This doesn't work, because the double quotes are removed from the argument string (the text within the outer set of quotes, i.e. "hello") when PowerShell calls the external command.

You get the same result if for instance you call echo.vbs '"hello"' with WScript.Echo WScript.Arguments(0) being the content of echo.vbs.

PS> cmd /c 'echo "hello"'
"hello"

echo is CMD's built-in echo command.

This works, because the command string (the text within the outer set of quotes, i.e. echo "hello") is run in CMD, and the built-in echo command preserves the argument's double quotes (running echo "hello" in CMD produces "hello").

PS> bash -c 'echo "hello"'
hello

echo is bash's built-in echo command.

This doesn't work, because the command string (the text within the outer set of quotes, i.e. echo "hello") is run in bash.exe, and bash removes the double quotes while parsing the command, before its built-in echo ever sees the argument (running echo "hello" in bash produces hello).
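
The quote removal that bash performs can be illustrated without bash itself. The sketch below leans on Python's `shlex.split`, which follows POSIX-shell-like quoting rules; it is an approximation of bash's word splitting, not bash, and `bash_words` is a hypothetical helper name.

```python
import shlex


def bash_words(command):
    """Approximate how a POSIX shell splits a command string into words."""
    return shlex.split(command)


print(bash_words('echo "hello"'))     # ['echo', 'hello']
print(bash_words(r'echo \"hello\"'))  # ['echo', '"hello"']
```

In the first call the quotes only group the word and are then discarded; in the second, the backslashes make them literal characters, so they survive to the echo command.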

If you want Cygwin's echo to print outer double quotes you need to add an escaped pair of double quotes to your string:

PS> c:\cygwin\bin\echo '"\"hello\""'
"hello"

I would've expected this to work for the bash built-in echo as well, but for some reason it doesn't:

PS> bash -c 'echo "\"hello\""'
hello
Ansgar Wiechers
  • The bash example works correctly for me from bash on a linux system. It also works from cmd.exe. So I'm going to chalk that up to another bit of quote removal oddness in powershell. – Etan Reisner Jan 24 '14 at 16:50
1

Quoting rules can get confusing when you're calling commands directly from PowerShell. Instead, I regularly recommend that people use the Start-Process cmdlet, along with its -ArgumentList parameter.

Start-Process -Wait -FilePath awk.exe -ArgumentList 'BEGIN {print "Hello"}' -RedirectStandardOutput ('{0}\awk.log' -f $env:USERPROFILE);

I don't have awk.exe (does that come from Cygwin?), but that line should work for you.

  • That runs the process in another window, so it doesn't work if you want to see the output in the Powershell window. Even with the `-nonewwindow` option it still suffers from the same failure to automatically escape quotes in arguments. – Duncan Jan 24 '14 at 13:49
  • Actually, since it includes the `-RedirectStandardOutput` parameter, a simple call to `Get-Content` will retrieve the command's output. –  Jan 24 '14 at 14:24