19

Here's a barebones Python app that simply prints the command-line arguments passed in:

import sys

if __name__ == "__main__":
    print("Arguments:")
    # enumerate yields each index together with its argument
    for i, arg in enumerate(sys.argv):
        print("[%s] = %s" % (i, arg))

And here are some sample runs:

python args.py hello world
Arguments:
[0] = args.py
[1] = hello
[2] = world

python args.py "hello world"
Arguments:
[0] = args.py
[1] = hello world

python args.py "hello\world"
Arguments:
[0] = args.py
[1] = hello\world

So far so good. But now when I end any argument with a backslash, Python chokes on it:

python args.py "hello\world\"
Arguments:
[0] = args.py
[1] = hello\world"

python args.py "hello\" world "any cpu"
Arguments:
[0] = args.py
[1] = hello" world any
[2] = cpu

I'm aware of Python's less-than-ideal raw string behavior via the "r" prefix (link), and it seems clear that it's applying the same behavior here.

But in this case, I don't have control of what arguments are passed to me, and I can't enforce that the arguments don't end in a backslash. How can I work around this frustrating limitation?

--

Edit: Thanks to those who pointed out that this behavior isn't specific to Python. It seems to be standard shell behavior (at least on Windows; I don't have a Mac at the moment).

Updated question: How can I accept args ending in a backslash? For example, one of the arguments to my app is a file path. I can't enforce that the client sends it to me without a trailing backslash, or with the backslash escaped. Is this possible in any way?

Aseem Kishore
  • What does the standard shell do with a trailing backslash? What does a standard command like "echo" or "ls" do with a trailing backslash? – S.Lott Aug 18 '09 at 01:02
  • Thanks S Lott, you're right -- when I make a .NET console app to do the same thing, I get the same behavior. So it's not specific to Python. – Aseem Kishore Aug 18 '09 at 01:09
  • Don't put your edit at the beginning: the first words of your question are used as the summary in the main question listing, and your editing comment isn't a good summary. – Philippe Carriere Aug 18 '09 at 01:31

8 Answers

14

That's likely the shell treating \ as an escape character, and thus escaping the character that follows it. So the shell sends \" as " (because it thinks you are trying to escape the double quote). The solution is to escape the escape character, like so: $ python args.py "hello\world\\".

mipadi
  • Up. Plus, it is not Python that chokes, it's the command line. It's waiting for the ending ", since you have escaped the " you typed. – Havenard Aug 18 '09 at 01:11
  • Alternatively, use single quotes: bash doesn't do escapes inside singly-quoted strings, so `$ python args.py 'hello world\'` behaves as expected. – Adam Rosenfield Aug 18 '09 at 01:14
  • Thanks mipadi. Btw, I noticed you only escaped the second backslash, but that seems to be correct. – Aseem Kishore Aug 18 '09 at 01:15
  • Adam, that's a brilliant tip (using single quotes instead of double). It works on Windows too. I'll keep that in mind for the future. – Aseem Kishore Aug 18 '09 at 01:17
  • @Aseem: I just copied one of the examples you listed in the original post. – mipadi Aug 18 '09 at 01:18
  • OP says that he can't control the arguments passed to his program. So any answer that depends on changing the arguments isn't helpful. – Schof May 19 '10 at 01:21
9

The Microsoft Parameter Parsing Rules

These are the rules for parsing a command line passed by CreateProcess() to a program written in C/C++:

  1. Parameters are always separated by a space or tab (multiple spaces/tabs OK)
  2. If the parameter does not contain any spaces, tabs, or double quotes, then all the characters in the parameter are accepted as is (there is no need to enclose the parameter in double quotes).
  3. Enclose spaces and tabs in a double quoted part
  4. A double quoted part can be anywhere within a parameter
  5. 2n backslashes followed by a " produce n backslashes + start/end double quoted part
  6. 2n+1 backslashes followed by a " produce n backslashes + a literal quotation mark
  7. n backslashes not followed by a quotation mark produce n backslashes
  8. If a closing " is followed immediately by another ", the 2nd " is accepted literally and added to the parameter (This is the undocumented rule.)

For a detailed and clear description see http://www.daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULESDOC
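
To see how rules 1 through 7 produce the output in the question, here is a minimal sketch of a splitter that follows them. The function name `split_msvc_style` is made up for illustration, rule 8 (the undocumented doubled quote) is deliberately left out, and the real C runtime treats the program name specially, so take this as an approximation rather than the actual parser:

```python
def split_msvc_style(cmdline):
    """Split a command line following rules 1-7 above (rule 8 omitted)."""
    args, current = [], []
    in_quotes = False
    started = False  # True once the current argument has begun (covers "")
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == '\\':
            # Measure the whole run of backslashes before deciding what it means.
            j = i
            while j < n and cmdline[j] == '\\':
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                current.append('\\' * (count // 2))
                if count % 2:
                    current.append('"')        # rule 6: literal quote
                else:
                    in_quotes = not in_quotes  # rule 5: quote toggles quoting
                j += 1                         # consume the quote
            else:
                current.append('\\' * count)   # rule 7: backslashes kept as is
            started = True
            i = j
        elif ch == '"':
            in_quotes = not in_quotes
            started = True
            i += 1
        elif ch in ' \t' and not in_quotes:
            # Rule 1: unquoted whitespace ends the current argument.
            if started:
                args.append(''.join(current))
                current, started = [], False
            i += 1
        else:
            current.append(ch)
            started = True
            i += 1
    if started:
        args.append(''.join(current))
    return args


for line in (r'args.py "hello\world\"',
             r'args.py "hello\" world "any cpu"',
             r'args.py "hello\world\\"'):
    print(split_msvc_style(line))
```

The first two inputs reproduce the mangled arguments from the question, and the third shows that doubling only the final backslash is enough.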

cod3monk3y
Charlie36
  • There is only an empty document at that link, can you provide another link to these parsing rules? – cod3monk3y Nov 20 '13 at 20:09
  • It looks like he moved his website to daviddeley.com. The exact text from this answer is included there, so I believe my assumption to be accurate, and I've updated the link accordingly. These are very interesting details. Thanks for posting! – cod3monk3y Nov 20 '13 at 20:15
8

The backslash at the end is interpreted as the start of an escape sequence, in this case a literal double quote character. I had a similar problem with handling environment parameters containing a path that sometimes ended with a \ and sometimes didn't.
The solution I came up with was to always insert a space at the end of the path string when calling the executable. My executable then uses the directory path with the slash and a space at the end, which gets ignored. You could possibly trim the path within the program if it causes you issues.

If %SlashPath% = "hello\"

python args.py "%SlashPath% " world "any cpu"
Arguments:
[0] = args.py
[1] = hello\ 
[2] = world
[3] = any cpu

If %SlashPath% = "hello"

python args.py "%SlashPath% " world "any cpu"
Arguments:
[0] = args.py
[1] = hello 
[2] = world
[3] = any cpu

Hopefully this will give you some ideas of how to get around your problem.
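
If the padding space does cause issues, trimming it inside the program is a one-liner. A small sketch, where `strip_padding` is a hypothetical helper name and the assumption is that real paths never legitimately end in a space:

```python
def strip_padding(path_arg):
    """Drop the trailing space(s) the caller added to protect a final backslash."""
    return path_arg.rstrip(' ')


# Both forms of the padded argument come back without the padding.
for received in ('hello\\ ', 'hello '):
    print(repr(strip_padding(received)))
```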

Anthony K
  • This issue is totally irrelevant to me now, but I love your solution of adding a space and will keep it in mind for the future! Thanks. =) – Aseem Kishore Mar 29 '11 at 02:38
2

The backslash 'escapes' the character following it. This means that the closing quotation marks become a part of the argument, and don't actually terminate the string.

This is the behaviour of the shell you're using (presumably bash or similar), not Python (although you can escape characters within Python strings, too).

The solution is to escape the backslashes:

python args.py "hello\world\\"

Your Python script should then function as you expect it to.
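
If the caller happens to be a Python program, it doesn't have to do this escaping by hand. The sketch below assumes you control the calling side (which the question says is not always the case). `shlex.quote` produces a POSIX-shell-safe form using single quotes, and `subprocess.list2cmdline`, an undocumented helper in the standard library, applies the Windows quoting rules and doubles a trailing backslash only when the argument needs quotes:

```python
import shlex
import subprocess

path_with_space = "hello world\\"   # hello world\
path_no_space = "hello\\world\\"    # hello\world\

# POSIX shells: single quotes switch off backslash escapes entirely.
print(shlex.quote(path_no_space))

# Windows rules: the trailing backslash is doubled inside the quoted argument,
# while the unquoted argument is passed through untouched.
print(subprocess.list2cmdline(["args.py", path_with_space, path_no_space]))
```

Passing the arguments as a list to `subprocess.run` avoids building the string yourself at all.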

harto
  • Well, the first backslash apparently isn't supposed to be escaped (see mipadi's post above, he doesn't escape the first one), but thanks for pointing out that it's not Python-specific. – Aseem Kishore Aug 18 '09 at 01:16
  • I suppose the first double-backslash isn't strictly required in this case, but that's only because `\w` doesn't mean anything to the shell. If the first slash were followed by an `n` or `t`, you'd end up with unwanted whitespace in the string. I'd just escape both slashes to be a little more defensive. – harto Aug 18 '09 at 02:40
  • No, I mean that it actually gives an incorrect input. python args.py "hello\\world\\" Arguments: [0] = args.py [1] = hello\\world\ – Aseem Kishore Aug 18 '09 at 05:45
1

The backslash (\) is escaping the ". That's all. That is how it is supposed to work.

Esteban Küber
1

If this is on Windows, then you are not using a standard Windows command prompt (or shell). This must be bash doing this. The Windows command prompt doesn't treat backslash as an escape character (since it's the file path separator).

Extra trivia point: the quoting character in Windows command prompts is caret: ^

Ned Batchelder
  • But it is on Windows. I'm seeing this behavior both with a Python console app and a C# .NET console app. – Aseem Kishore Aug 18 '09 at 05:43
  • But what shell are you using? You can use bash as your shell on Windows. I'm not saying it isn't happening, just wanted to clarify that your choice of shell matters. – Ned Batchelder Aug 18 '09 at 10:38
0

When the user passes your function a string "hello\", regardless of what their intention was, they sent the actual string hello", just like if a user passed a filepath like "temp\table" what they have really typed, intentionally or not, is "temp able" (tab in the middle).

This being said, a solution to this problem means that if a user inputs "temp\table" and honestly means "temp able", you are going to process this into "temp\table" and now you've programmatically destroyed the user's input.

With this warning in mind, if you still want to do this, you can look for the string representation of these escaped-characters and replace them. As a really easy example, something like this:

def allow_tabs(str_w_tab):
    # str.replace returns a new string, so the result has to be kept
    str_w_tab = str_w_tab.replace('\t', '\\t')
    print(str_w_tab)

Now if you want to handle all the other escape characters, you'll have to do something similar for each one. As for being able to do this for the example: "hello\", the user passed you the string hello", and whether they intended to or not, they never closed the double-quote, so this is what your program sees.
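
Doing "something similar for each one" can be collapsed into a single translation table. This is only a sketch under the assumption that control characters really did arrive in the argument; `ESCAPES` and `reescape` are illustrative names, and the table covers just the common single-character escapes:

```python
# Map each control character back to its backslash spelling.
ESCAPES = {
    '\t': r'\t',
    '\n': r'\n',
    '\r': r'\r',
    '\a': r'\a',
    '\b': r'\b',
    '\f': r'\f',
    '\v': r'\v',
}
_TABLE = str.maketrans(ESCAPES)


def reescape(text):
    """Return text with the control characters above spelled out again."""
    return text.translate(_TABLE)


print(reescape('temp\table'))
```

The same warning applies: a user who really meant a tab gets their input rewritten.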

0

On *nix-based systems, this is a fundamental shell limitation, as others have said here. So, just suck it up. That said, it's really not that important because you don't often need backslashes in arguments on those platforms.

On Windows, however, backslashes are critical! A path ending in one explicitly denotes a directory rather than a file. Both the documentation for MS C (see: https://learn.microsoft.com/en-us/previous-versions/17w5ykft(v=vs.85) ) and the Python source (e.g. subprocess.list2cmdline in https://github.com/python/cpython/blob/master/Lib/subprocess.py) describe these quoting rules, under which a quoted argument cannot end in a single backslash. So, I forgive the Python developers for keeping the logic the same - but not the MS C ones! This is not a cmd.exe shell issue or a universal limitation for arguments in Windows! (The caret ^ is the equivalent escape character in that native shell.)

Batch Example (test.bat):

@echo off
echo 0: %0 
echo 1: %1 
echo 2: %2
echo 3: %3 

Now execute it (via cmd.exe):

test.bat -t "C:\test\this path\" -v

Yields:

0: test.bat
1: -t
2: "C:\test\this path\"
3: -v

As you can see - a simple batch file implicitly understands what we want!

But... let's see what happens in Python, when using the standard argparse module (https://docs.python.org/3/library/argparse.html), which by default reads the already-parsed sys.argv:

broken_args.py

import os
import argparse # standard library since Python 2.7 / 3.2

parser = argparse.ArgumentParser( epilog="DEMO HELP EPILOG" ) 
parser.add_argument( '-v', '--verbose', default=False, action='store_true', 
                     help='enable verbose output' )
parser.add_argument( '-t', '--target', default=None,
                     help='target directory' )                           
args = parser.parse_args()                       
print( "verbose: %s" % (args.verbose,) )
print( "target: %s" % (os.path.normpath( args.target ),) )

Test that:

python broken_args.py -t "C:\test\this path\" -v

Yields these bad results:

verbose: False
target: C:\test\this path" -v

And so, here's how I solved this. The key "trick" is first fetching the full, raw command line for the process via the Windows api:

fixed_args.py

import sys, os, shlex
import argparse # standard library since Python 2.7 / 3.2

IS_WINDOWS = sys.platform.startswith( 'win' )
IS_FROZEN  = getattr( sys, 'frozen', False )
    
class CustomArgumentParser( argparse.ArgumentParser ):
    if IS_WINDOWS:
        # override
        def parse_args( self ):
            def rawCommandLine():
                from ctypes.wintypes import LPWSTR
                from ctypes import windll
                Kernel32 = windll.Kernel32
                GetCommandLineW = Kernel32.GetCommandLineW
                GetCommandLineW.argtypes = ()
                GetCommandLineW.restype  = LPWSTR
                return GetCommandLineW()                            
            NIX_PATH_SEP = '/'                
            commandLine = rawCommandLine().replace( os.sep, NIX_PATH_SEP )
            skipArgCount = 1 if IS_FROZEN else 2
            args = shlex.split( commandLine )[skipArgCount:]        
            return argparse.ArgumentParser.parse_args( self, args )
 
parser = CustomArgumentParser( epilog="DEMO HELP EPILOG" ) 
parser.add_argument( '-v', '--verbose', default=False, action='store_true', 
                     help='enable verbose output' )
parser.add_argument( '-t', '--target', default=None,
                     help='target directory' )                           
args = parser.parse_args()                       
print( "verbose: %s" % (args.verbose,) )
print( "target: %s" % (os.path.normpath( args.target ),) )

Confirm the fix:

python fixed_args.py -t "C:\test\this path\" -v

Yields these good results:

verbose: True
target: C:\test\this path
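
The splitting step can be tried on any platform by feeding it a hard-coded command line instead of calling GetCommandLineW. The sketch below is a simplified stand-in for the override above, not a replacement: `split_raw_command_line` is a hypothetical name, the raw string is assumed to be what Windows would hand the process, and `ntpath` is used so the path normalizes the Windows way even off Windows. Note that every backslash in every argument becomes a forward slash, which is only acceptable when the arguments are paths:

```python
import argparse
import ntpath
import shlex


def split_raw_command_line(raw, skip=2):
    """Split a raw Windows command line; skip 'python script.py' by default."""
    # Swapping backslashes for forward slashes first keeps shlex from reading
    # a trailing backslash as an escape for the closing quote.
    return shlex.split(raw.replace('\\', '/'))[skip:]


raw = r'python fixed_args.py -t "C:\test\this path\" -v'
argv = split_raw_command_line(raw)

parser = argparse.ArgumentParser()
parser.add_argument('-v', '--verbose', default=False, action='store_true')
parser.add_argument('-t', '--target', default=None)
args = parser.parse_args(argv)  # an explicit list bypasses sys.argv

target = ntpath.normpath(args.target)
print("verbose: %s" % (args.verbose,))
print("target: %s" % (target,))
```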
BuvinJ