
Working with "C:\Program Files (x86)" I ran into a strange issue with a program located somewhere below that path. I reproduced the behaviour with a test program.

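    #include <stdio.h>
    #include <tchar.h>

    int _tmain(int argc, _TCHAR* argv[])
    {
        /* Print the argument count, then each argument on its own line. */
        _tprintf(_T("%d\n"), argc);
        for (int i = 0; i < argc; i++) {
            _tprintf(_T("%s\n"), argv[i]);
        }
        return 0;
    }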

The program prints the number of command-line arguments, followed by each argument on its own line (including the first one, the path used to invoke the program). I named it "HelloWorld.exe" because I was in a hurry.

I tried three ways of running the program and expected each to report a single argument, but one of them reports two.

When I run HelloWorld.exe from its own directory, the output is

1
HelloWorld.exe

That output is correct and expected.

When I run HelloWorld.exe, which is located in "P:\Test (x86)", from another location and use the quoted path, the output is

1
P:\Test (x86)\HelloWorld.exe

That output is also correct and expected.

However, when I run HelloWorld.exe from another location and use a path with escaped spaces and brackets, the program is found (i.e. the path is correct), but the output is wrong:

2
P:\Test
(x86)\HelloWorld.exe

For some reason the escaped space in

P:\Test^ ^(x86^)\HelloWorld.exe

is treated as an argument separator. Windows reads the path as one string to find the program, but then decides that it is really two strings when it builds the argument array the program receives.

This behaviour occurs in both Windows XP (x86) and Windows Server 2008 R2 (x64). I assume it is present in all (NT) versions of Windows.

Andrew J. Brehm

1 Answer


Update:

Oops. Maybe it is a bug (or perhaps the term is misfeature) in Windows.

I just made a quick little test program that simply calls GetCommandLine() and prints that out to the console.

I called it with:

test The^ rain^ in^ Spain^ falls^ mainly^ on^ the^ plain^ ^(or^ so^ they^ say^).

And this is the output:

test  The rain in Spain falls mainly on the plain (or so they say).

So I guess the runtime library never sees the carets at all, and your only option is to tell your users to use quotes instead of escapes.


No, it's not a bug in Windows. It's a bug (although I might prefer the term shortcoming or deficiency in this case) in your C runtime library.

Windows is processing the escape characters and locating your executable. But it's not what separates the command line into arguments. Windows is not what calls your main function (or _tmain in this case). It simply starts your process at the entry point defined within the PE header. At this location is some C library code (or a dynamic call into it) that, among other startup tasks, calls the kernel32 function GetCommandLine(), then splits that on spaces, honoring quotes, but not, apparently, caret escapes.
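To make the two stages concrete, here is a minimal, portable sketch of what appears to be happening. Both functions are hypothetical stand-ins written for illustration, not the actual cmd.exe or C runtime code: `strip_carets` models the shell removing caret escapes, and `split_args` models a simplified splitter that breaks on spaces and tabs and honors double quotes (the real runtime also has rules for backslashes before quotes, which are left out here).

```c
/* Hypothetical stand-in for the shell stage: drop each caret and keep the
   character it escapes. The caller provides an output buffer at least as
   large as the input. */
static void strip_carets(const char *in, char *out)
{
    while (*in) {
        if (*in == '^' && in[1] != '\0')
            in++;                 /* skip the caret, keep the next character */
        *out++ = *in++;
    }
    *out = '\0';
}

/* Simplified stand-in for the runtime's splitter: separates on spaces and
   tabs, treats double quotes as grouping, and knows nothing about carets.
   Rewrites buf in place and returns the number of arguments found. */
static int split_args(char *buf, char *argv[], int max_args)
{
    int argc = 0;
    char *src = buf;

    while (*src && argc < max_args) {
        while (*src == ' ' || *src == '\t')
            src++;                /* skip leading separators */
        if (!*src)
            break;

        char *dst = src;          /* compact the argument over the quotes */
        int in_quotes = 0;
        argv[argc++] = dst;

        while (*src && (in_quotes || (*src != ' ' && *src != '\t'))) {
            if (*src == '"') {
                in_quotes = !in_quotes;
                src++;            /* quotes group text but are not copied */
            } else {
                *dst++ = *src++;
            }
        }
        if (*src)
            src++;                /* step past the separator first... */
        *dst = '\0';              /* ...then terminate the argument */
    }
    return argc;
}
```

Passing `P:\Test^ ^(x86^)\HelloWorld.exe` through `strip_carets` and then `split_args` yields two arguments, `P:\Test` and `(x86)\HelloWorld.exe`, which matches the output in the question. Passing the quoted form `"P:\Test (x86)\HelloWorld.exe"` straight to `split_args` yields a single argument. Once the carets are gone, the splitter has no way to tell an escaped space from an ordinary one.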

It's not very surprising, really. I don't think most people know that you can escape characters with carets on the Windows command line. I certainly didn't.

If this is causing a real-world issue for you, where somebody is actually using caret escapes to invoke your program, you can either tell them to stop, or write your own command-line parsing routine, passing to it the output of GetCommandLine(), and ignore what you get passed to you in main.

P Daddy
  • My C runtime library is whatever Visual Studio uses. Wouldn't that be Windows' C runtime library? – Andrew J. Brehm Aug 24 '12 at 14:32
  • Looks to me like Microsoft's C runtime library, when it doesn't recognise the Windows escape character, is not Windows-compatible. :-) – Andrew J. Brehm Aug 24 '12 at 14:33
  • In my outputs the carets are also never visible. – Andrew J. Brehm Aug 24 '12 at 19:11
  • Note that your output (if printed here correctly) suggests that "test" and the rest are seen as two distinct strings, i.e. the carets did their job and escaped the spaces. – Andrew J. Brehm Aug 24 '12 at 19:14
  • @Andrew: I don't see it that way. There's nothing to say that "test" and "The rain in Spain..." are two distinct strings. Note that the result of `GetCommandLine()` is *exactly the same* whether the carets were used or not. There is absolutely nothing to tell the argument-splitting code that argv[1] begins at "The" and ends at "say).". – P Daddy Aug 24 '12 at 23:56
  • Change your code to insert newlines between strings? With the two spaces between the first and second string (the first being the path to the program) and the one space at all other locations, this is what I would guess. – Andrew J. Brehm Aug 25 '12 at 00:24
  • Obviously the carets aren't passed to the target program; they're processed (and removed) by cmd.exe. That's as expected. – Harry Johnston Aug 25 '12 at 00:36
  • Arguably this is a bug in the command processor; it shouldn't be accepting caret-escaped spaces as a way of specifying the path to an executable. The workaround is straightforward, of course: don't use caret-escaped spaces on the command line. – Harry Johnston Aug 25 '12 at 00:45
  • @Andrew: There aren't two strings. There's just one. `"test The rain in Spain..."` That's the return value from `GetCommandLine()`. You can test with the simplest of programs: `int main(){printf("%s", GetCommandLine());}` Call it with and without carets. The output is the same. The extra space inserted between the executable name and the rest of the command line *is* an oddity, but it's not what you're supposing it to be. – P Daddy Aug 25 '12 at 00:56
  • Oddly, using caret-escaped spaces in this way only works if you're specifying the full path, including the drive letter. Otherwise, the caret-escaped space is treated exactly as if it were a normal space. – Harry Johnston Aug 25 '12 at 01:19