4

I'm having some trouble with escape characters and json.dumps.

It seems like extra escape characters are being added whenever json.dumps is called. Example:

not_encoded = {'data': '''!"#$%'()*+,-/:;=?@[\]^_`{|}~0000&<>'''}
print(not_encoded)
{'data': '!"#$%\'()*+,-/:;=?@[\\]^_`{|}~0000&<>'}

This is fine, but when I call json.dumps it adds a lot of extra characters.

json.dumps(not_encoded)
'{"data": "!\\"#$%\'()*+,-/:;=?@[\\\\]^_`{|}~0000&<>"}'

The dump shouldn't look like this. It appears to be double escaping the \ and the ". Does anyone know why this happens and how to fix it? I would like json.dumps to output:

'{"data": "!\"#$%'()*+,-/:;=?@[\\]^_`{|}~0000&<>"}'

Edit: loading the dump back in:

the_dump = json.dumps(not_encoded)
json.loads(the_dump)
{u'data': u'!"#$%\'()*+,-/:;=?@[\\]^_`{|}~0000&<>'}

The problem is that I'm hitting an API endpoint which needs these special characters, but the payload goes over the character limit when json.dumps adds additional escape characters (\\\\ and \\").

Jordan

2 Answers

6

It is worth reading up on the difference between print, str and repr in Python (see here for example). You are comparing the printed original string with the repr of the JSON encoding. The latter shows double escapes: one layer from the JSON encoding and one from Python's string representation.

Otherwise there is no issue. If you compare len(not_encoded['data']) with len(json.loads(json.dumps(not_encoded))['data']), you will find they are the same. There are no extra characters, only different ways of displaying them.
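As a minimal sketch of the point above (written for Python 3, with the backslash in the literal spelled as `\\` so it does not rely on an unrecognised escape), the following compares what repr and print show for the same dump and checks the round trip. The variable names the_dump and round_tripped are only illustrative.

```python
import json

# Same data as in the question; the backslash is written explicitly as \\.
not_encoded = {'data': '''!"#$%'()*+,-/:;=?@[\\]^_`{|}~0000&<>'''}
the_dump = json.dumps(not_encoded)

# repr() is what the interactive prompt echoes: it adds Python-level
# escapes on top of the escapes that JSON itself requires.
print(repr(the_dump))

# print() shows the actual JSON text, i.e. what would go over the wire.
print(the_dump)

# Decoding gives back exactly the original data, so nothing was added to it.
round_tripped = json.loads(the_dump)
print(len(not_encoded['data']) == len(round_tripped['data']))
```

The second print is the form the question asks for; the extra backslashes exist only in the repr display, not in the string itself.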

ChrisD
  • I think the OP's issue is that the *JSON encoded data grows*, and as far as I can tell, this is impossible to prevent? – bohrax Aug 30 '18 at 16:34
  • I thought he was concerned about json.dumps 'double escaping the \ and "'. The second escape was from the call to repr, not the json encoding (as you point out it is required to escape those two special characters) – ChrisD Aug 30 '18 at 22:59
  • Ah, I see your point now. Perhaps that *is* what the question is about, but in that case it shouldn't affect what is actually sent over the wire. – bohrax Aug 31 '18 at 05:59
3

json.dumps is required to escape " and \ according to the JSON standard. If the API uses JSON, you cannot prevent your data from growing in length when it contains these characters.

From json.org:

JSON string syntax
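To see how much the encoded text really grows, here is a rough sketch (Python 3) that measures the JSON-encoded string from the question against the original. The names body, growth and escaped are hypothetical, and the equality only holds for strings like this one that contain no control characters or non-ASCII characters, which json.dumps also escapes by default.

```python
import json

# The string from the question, with the backslash written as \\.
data = '''!"#$%'()*+,-/:;=?@[\\]^_`{|}~0000&<>'''
encoded = json.dumps(data)

# Drop the surrounding double quotes to get just the encoded string body.
body = encoded[1:-1]

# Each " and each \ costs one extra character (the escaping backslash).
growth = len(body) - len(data)
escaped = data.count('"') + data.count('\\')
print(growth, escaped)
```

For this input the growth is one character per " or \ in the data, which is the minimum the JSON string syntax allows.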

bohrax