10

I am learning how to use simplejson to decode a JSON file, but I ran into an "Invalid \escape" error. Here is the code:

import simplejson as json

def main():
    json.loads(r'{"test":"\x27"}')

if __name__ == '__main__':
    main()

And here is the error message:

Traceback (most recent call last):
  File "hello_world.py", line 7, in <module>
    main()
  File "hello_world.py", line 4, in main
    json.loads(r'{"test":"\x27"}')
  File "C:\Users\zhangkai\python\simplejson\__init__.py", line 307, in loads
    return _default_decoder.decode(s)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 335, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 351, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "C:\Users\zhangkai\python\simplejson\scanner.py", line 36, in _scan_once
    return parse_object((string, idx + 1), encoding, strict, _scan_once, object_hook)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 185, in JSONObject
    value, end = scan_once(s, end)
  File "C:\Users\zhangkai\python\simplejson\scanner.py", line 34, in _scan_once
    return parse_string(string, idx + 1, encoding, strict)
  File "C:\Users\zhangkai\python\simplejson\decoder.py", line 114, in py_scanstring
    raise ValueError(errmsg(msg, s, end))
ValueError: Invalid \escape: 'x': line 1 column 10 (char 10)

I thought a JSON parser was supposed to recognize this escape. What is wrong, and what should I do?

pyfunc
kkpattern
  • Related: Missing double escape in windows file path: [python - json reading error json.decoder.JSONDecodeError: Invalid \escape - Stack Overflow](https://stackoverflow.com/questions/44687525/json-reading-error-json-decoder-jsondecodeerror-invalid-escape), octal escape [python - Fixing invalid JSON escape - Stack Overflow](https://stackoverflow.com/questions/15198426/fixing-invalid-json-escape) – user202729 Feb 17 '21 at 11:56

3 Answers

15

JSON has no hex escape (\xNN), unlike some languages (including JavaScript) and notations; details are at http://json.org/. It has a Unicode escape, \uNNNN, where NNNN is four hex digits, but no \x hex escape.
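
To see the difference in practice, here is a minimal sketch that parses the same value written once with a `\u` escape and once with a `\x` escape. It uses the standard-library `json` module on the assumption that it treats these escapes the same way simplejson does; `try_parse` is a hypothetical helper introduced only for this illustration.

```python
import json

def try_parse(text):
    """Return the parsed value, or a short error string if parsing fails."""
    try:
        return json.loads(text)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        return "error: %s" % exc

# \u0027 is a valid JSON escape for the apostrophe character.
ok = try_parse(r'{"test":"\u0027"}')

# \x27 is not a JSON escape, so the parser rejects the whole document.
bad = try_parse(r'{"test":"\x27"}')

print(ok)
print(bad)
```

The raw-string prefix `r'...'` matters here: without it, Python itself would interpret the backslash sequence before the JSON parser ever saw it.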

T.J. Crowder
  • Thanks. So if the JSON file has \x notation, I should convert it myself first? – kkpattern Nov 28 '10 at 08:55
  • 9
    @user308587: If the file has `\x` notation, it's not in JSON format. If you want to accept invalid JSON anyway, yes, you'd have to pre-process it yourself. Assuming you want to treat the `\x` the way JavaScript does, convert `\xNN` to `\u00NN` (e.g., `\x27` becomes `\u0027`). FWIW, how `\x` and `\u` are handled by JavaScript -- **not** JSON -- is covered by Section 7.8.4 of [the ECMAScript spec](http://www.ecma-international.org/publications/standards/Ecma-262.htm). But my read is it really is just a matter of changing the `x` to a `u` and adding the leading zeroes. Best, – T.J. Crowder Nov 28 '10 at 09:04
  • @T.J.Crowder Can you please elaborate `just a matter of changing the x to a u and adding the leading zeroes` ? How do I do with a character that is part of a big string? – Volatil3 Feb 12 '14 at 17:08
  • 1
    @Volatil3: Say you have raw JSON in a string, for instance: `str = '{"foo": "bar\\x23 testing 1 2 3 \\x23"}'` You can convert those to `\u` notation with a simple replace: `str2 = str.replace(/\\x/g, "\\u00")` Then `str2` will successfully parse, and you'll have an object with a property, `foo`, with the value `"bar# testing 1 2 3 #"` (because `\x23` / `\u0023` is `#`). – T.J. Crowder Feb 12 '14 at 17:59
  • @Volatil3: Not `load`, `JSON.parse`. Assuming the text you're referring to is JSON. – T.J. Crowder Feb 12 '14 at 18:25
  • @T.J.Crowder there is no `json.parse` in Python 2.x – Volatil3 Feb 12 '14 at 18:28
  • @Volatil3: Ah, I didn't remember this question was originally about Python... My `replace` above may be suspect as well (I don't do Python, that was a JavaScript example); you'll have to massage it into the equivalent Python. – T.J. Crowder Feb 12 '14 at 18:34
  • It is nonsense that JSON has no `x` notation; JavaScript `eval` accepts it, so it is valid JavaScript format. – Vitalii Nov 06 '17 at 14:54
  • 1
    @Vitaliy: `eval` accepting it doesn't make it JSON. `eval` also accepts the string `(function foo() { alert("Not JSON"); })()` (http://jsfiddle.net/whghsf6o), but that's not JSON either. :-) While some actual JSON parsers (as opposed to `eval`) do accept `\xNN` notation (V8's, for instance), it is not valid JSON. Details in [the JSON website linked above](http://json.org/) as well as [the RFC](https://tools.ietf.org/html/rfc7159) and [the Standard](http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf) (pdf). `\xNN` in a string is valid *JavaScript*, but not valid *JSON*. – T.J. Crowder Nov 06 '17 at 15:32
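
The JavaScript `replace` discussed in the comments above could be translated to Python roughly as follows. This is a sketch under assumptions, not a definitive fix: `hex_escapes_to_unicode` is a hypothetical helper name, the standard-library `json` module stands in for simplejson, and the input is assumed to be otherwise valid JSON. Unlike a plain text replace, it walks the escapes left to right so that an escaped backslash followed by a literal `x` (as in `\\x27`) is left alone.

```python
import json
import re

# Match any backslash escape; capture the two hex digits only for \xNN.
_ESCAPE = re.compile(r'\\(?:x([0-9a-fA-F]{2})|.)', re.DOTALL)

def hex_escapes_to_unicode(text):
    """Rewrite \\xNN escapes as \\u00NN and leave every other escape untouched."""
    def fix(match):
        if match.group(1) is not None:
            return '\\u00' + match.group(1)
        return match.group(0)
    return _ESCAPE.sub(fix, text)

raw = r'{"foo": "bar\x23 testing 1 2 3 \x23"}'
parsed = json.loads(hex_escapes_to_unicode(raw))
print(parsed["foo"])
```

Because the pattern consumes each backslash together with the character after it, `\\` is always treated as one unit, which is what keeps `\\x27` from being rewritten by mistake.
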
5

This is expected behavior from a parser, because that JSON is invalid: within a string, a backslash may be followed only by ", \, /, b, f, n, r, t or u (which must then be followed by four hex digits). An x is not allowed. See the spec at http://json.org/
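
The rule above can be turned into a quick check for offending escapes before handing text to a parser. The following is only an illustrative sketch: `find_invalid_escapes` is a hypothetical helper, and for simplicity it scans the whole text rather than only the contents of JSON strings (a backslash outside a string is invalid JSON in any case).

```python
import re

# Characters that may follow a backslash on their own: " \ / b f n r t
VALID_SIMPLE = set('"\\/bfnrt')

# A backslash followed by either uXXXX (four hex digits) or any single character.
ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|.)', re.DOTALL)

def find_invalid_escapes(text):
    """Return the escape sequences in text that JSON does not allow."""
    invalid = []
    for match in ESCAPE.finditer(text):
        body = match.group(1)
        if len(body) == 5:        # a complete \uXXXX escape
            continue
        if body in VALID_SIMPLE:  # one of the single-character escapes
            continue
        invalid.append(match.group(0))
    return invalid

print(find_invalid_escapes(r'{"test":"\x27"}'))
print(find_invalid_escapes(r'{"test":"\u0027 \n \\ \/"}'))
```

An incomplete `\u` escape (fewer than four hex digits) is reported too, since it falls through to the single-character branch and `u` alone is not in the allowed set.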

Quentin
0

Try python-cjson:

import cjson
s = cjson.encode({'abc':123,'def':'xyz'})
print 'json: %s - %s' % (type(s), s)
s = cjson.decode(s)
print '%s - %s' % (type(s), s)
andres101