I have an application that sends a JSON object (formatted with Prototype) to an ASP server. On the server, loads() from the Python 2.6 "json" module tries to parse the JSON, but it chokes on some combination of backslashes. Observe:

>>> s
'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'

>>> tmp = json.loads(s)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  {... blah blah blah...}
  File "C:\Python26\lib\json\decoder.py", line 155, in JSONString
    return scanstring(match.string, match.end(), encoding, strict)
  ValueError: Invalid \escape: line 1 column 58 (char 58)

>>> s[55:60]
u'ost\\d'

So column 58 is the escaped-backslash. I thought this WAS properly escaped! UNC is \\host\dir\file.exe, so I just doubled up on slashes. But apparently this is no good. Can someone assist? As a last resort I'm considering converting the \ to / and then back again, but this seems like a real hack to me.

Thanks in advance!

Chris
  • "So column 58 is the escaped-backslash." It is the **backslash**; "escaped" is only relevant in the context of the **representation of** the string that you get from evaluating `s` at the command prompt. (You would not see it by doing `print(s)`, for example.) So, the actual text contains a *single* backslash followed by a lowercase d, which is not valid *in the JSON data*. – Karl Knechtel Aug 04 '22 at 22:38

3 Answers

The correct JSON, written as a Python raw string literal, is:

s = r'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'

Note the letter r: if you omit it, you need to escape each \ for Python too.
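To make the two layers of escaping concrete, here is a minimal sketch (Python 3 syntax; the variable names and the shortened JSON object are illustrative assumptions). It writes the same JSON text once as a raw literal and once as a plain literal, then decodes it:

```python
import json

# Raw literal: Python keeps every backslash, so the JSON text itself
# contains four backslashes before "host" and two before "dir" and "file".
raw = r'{"Path": "\\\\host\\dir\\file.exe"}'

# Plain literal: every backslash has to be doubled once more for Python.
plain = '{"Path": "\\\\\\\\host\\\\dir\\\\file.exe"}'

same_text = (raw == plain)       # both literals produce identical text
path = json.loads(raw)["Path"]   # JSON decoding halves the backslashes

print(same_text)  # True
print(path)       # \\host\dir\file.exe
```

The raw literal only changes how the text is typed in source code; the resulting string object is the same either way.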

>>> import json
>>> d = json.loads(s)
>>> d.keys()
[u'FileExists', u'Path', u'Version']
>>> d.values()
[True, u'\\\\host\\dir\\file.exe', u'4.3.2.1']

Note the difference:

>>> repr(d[u'Path'])
"u'\\\\\\\\host\\\\dir\\\\file.exe'"
>>> str(d[u'Path'])
'\\\\host\\dir\\file.exe'
>>> print d[u'Path']
\\host\dir\file.exe

By default, the Python REPL prints repr(obj) for an object obj:

>>> class A:
...   __str__ = lambda self: "str"
...   __repr__  = lambda self: "repr"
... 
>>> A()
repr
>>> print A()
str

Therefore your original string s is not properly escaped for JSON: it contains the unescaped sequences '\d' and '\f'. print s must show '\\d'; otherwise it is not correct JSON.
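The two unescaped sequences fail in different ways, which a short sketch can show (Python 3 syntax; the `C:\...` paths and variable names are made up for illustration). A lone backslash before `d` is rejected outright, while a lone backslash before `f` is a valid JSON escape for a form feed, so it parses but silently changes the data:

```python
import json

# JSON text with a single backslash before "d": not a JSON escape, so it fails.
bad = r'{"Path": "C:\dir"}'
try:
    json.loads(bad)
    rejected = False
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    rejected = True

# JSON text with a single backslash before "f": this IS a JSON escape
# (form feed, U+000C), so decoding succeeds but the path is corrupted.
sneaky = r'{"Path": "C:\file.exe"}'
decoded = json.loads(sneaky)["Path"]

print(rejected)       # True
print(repr(decoded))  # 'C:\x0cile.exe'
```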

NOTE: A JSON string is a collection of zero or more Unicode characters, wrapped in double quotes, using backslash escapes (json.org). I've skipped encoding issues (namely, conversion between byte strings and unicode) in the above examples.

jfs
  • :) `>>> s = r'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'` then `>>> json.loads(s)` gives `{u'FileExists': True, u'Path': u'\\\\host\\dir\\file.exe', u'Version': u'4.3.2.1'}` – Chris Oct 01 '09 at 18:01
  • So what is r actually doing? How can I apply it to a string that's already stored as, say, "foo". Is it some kind of encoding? – Chris Oct 01 '09 at 18:03
  • @Chris: `r''` is convenient for writing Windows paths and regexps (you don't need to escape backslashes in such literal strings). `r''` only affects how you write literals; it has no meaning for string objects. – jfs Oct 01 '09 at 18:08
  • Ok. What was happening was that I was making up this test data and using them in the python shell for "rehearsing" what I wanted my application to do. I guess I was overescaping, because once I said, "eff this, let's just try it live", it worked! Thanks for the comments that led me to a better understanding. – Chris Oct 01 '09 at 18:20
  • @Chris If you don't want to have to escape those ugly backslashes in Windows file paths, you can always do `json.loads(s.replace('\\', '\\\\'))`. This way, any backslashes are automatically escaped. **NOTE** this will double escape any valid escape sequences! So don't use it when you're already escaping.... – Chris Collett Jul 06 '21 at 19:30
  • @ChrisCollett: no, it won't help: `'\f'.replace('\\', '\\\\') == '\f'` (there is no actual backslash in that string: '\f' is a single character, `'\u000c' == '\f'`). – jfs Jul 07 '21 at 15:58

Since the exception gives you the index of the offending escape character, this little hack I developed might be nice :)

import json

def fix_JSON(json_message=None):
    """Parse json_message, blanking out each character the decoder rejects."""
    try:
        return json.loads(json_message)
    except ValueError as e:
        # The error message ends with "(char N)"; N is the offending index:
        idx_to_replace = int(str(e).split(' ')[-1].replace(')', ''))
        # Overwrite the offending character with a space, then retry:
        json_message = list(json_message)
        json_message[idx_to_replace] = ' '
        return fix_JSON(json_message=''.join(json_message))
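On Python 3.5 and later, the decoder raises `json.JSONDecodeError`, which exposes the index directly as its `pos` attribute, so the message text does not need to be parsed. A minimal sketch of the same idea under that assumption (the function name and the `max_repairs` parameter are hypothetical, not part of the hack above; it also assumes CPython's default scanner, which reports the index of the backslash itself, as in the question's traceback):

```python
import json

def loads_blanking_bad_chars(text, max_repairs=20):
    """Parse text, replacing each rejected character with a space.

    Hypothetical helper: gives up after max_repairs attempts instead of
    recursing without bound.
    """
    for _ in range(max_repairs):
        try:
            return json.loads(text)
        except json.JSONDecodeError as err:
            if err.pos >= len(text):
                raise  # input is truncated; there is no character to blank
            # err.pos is the index where decoding failed.
            text = text[:err.pos] + " " + text[err.pos + 1:]
    raise ValueError("gave up after %d repairs" % max_repairs)

# Made-up input with one invalid escape (a backslash followed by "d").
result = loads_blanking_bad_chars(r'{"Path": "\\host\dir"}')
print(result)
```

Like the original hack, this changes the data rather than recovering it: the invalid backslash becomes a space, and any sequence that happens to be a valid escape is decoded as such. It seems best suited to salvaging a one-off payload, not to routine use.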
Mihir Mehta
Blairg23
  • Thanks for the code. In my case (Python 3.5) I needed to change `idx_to_replace = int(e.message.split(' ')[-1].replace(')',''))` to `idx_to_replace = int(str(e).split(' ')[-1].replace(')', ''))` – TitanFighter Sep 25 '16 at 19:49

>>> s
'{"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}'
>>> print s
{"FileExists": true, "Version": "4.3.2.1", "Path": "\\host\dir\file.exe"}

You've not actually escaped the string, so it's trying to parse invalid escape codes like \d or \f. Consider using a well-tested JSON encoder, such as json2.js.
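json2.js is a JavaScript library, so it belongs on the client side. As a stand-in, this sketch uses Python's own `json.dumps` to show what a real encoder emits for the same data (Python 3 syntax; the `data` dict mirrors the question's object and the variable names are assumptions):

```python
import json

# The path as it should exist in memory: two leading backslashes, then singles.
data = {"FileExists": True, "Version": "4.3.2.1",
        "Path": r"\\host\dir\file.exe"}

# The encoder doubles every backslash in the JSON text it produces.
encoded = json.dumps(data)
print(encoded)
# {"FileExists": true, "Version": "4.3.2.1", "Path": "\\\\host\\dir\\file.exe"}

# Decoding the encoder's output gives back the original data unchanged.
round_tripped = json.loads(encoded)
print(round_tripped["Path"])  # \\host\dir\file.exe
```

Comparing the printed JSON text with the `print s` output above shows the difference: valid JSON has four backslashes before `host` and two before `dir` and `file`.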

John Millikin