
I have a Python server-side application that generates a simple HTML page with a big blurb of client-side JavaScript that generates, client-side, the DOM tree displayed to the user, based on a big blob of JSON data assigned to a JS variable. Some of that JSON data contains strings, some of which contain HTML tags. It all boils down to something like this:

<html>
...
var tmp = "<p>some text</p>";
...
</html>

Unsurprisingly, the above does not work since it should look like the following to make the browser HTML parser happy:

<html>
...
var tmp = "<p>some text<\/p>";
...
</html>

(notice the escaped forward slash)

The JSON inserted in the HTML is generated with Python's default json library, namely with json.dumps, which is explicitly designed not to escape the forward slash in strings.

I tried to subclass json.JSONDecoder to override its behavior for python strings but this does not work since it does not allow specialization of the serialization of basic python types.

I tried to use a variety of other python json libraries without much luck: it seems that since most people hate the escaped forward slashes, most libraries do not generate them.

I could escape the strings by hand before stuffing them in my python data structures before calling json.dumps. I could also write a function to recursively iterate over the data structure, spot strings, and escape them automatically (nicer over the long run). I could maybe escape the string generated by json.dumps before stuffing it in the HTML (I am not sure that this could not lead to invalid JSON being inserted in the HTML).

Which leads me to my question: is there a json serialization library that can be coerced to escape forward slashes in strings in python ?

mathieu
  • I believe just a simple search&replace would be a valid workaround – John Dvorak Aug 21 '13 at 14:51
  • 2
    You shouldn't need to escape forward slashes because JavaScript should be in an HTML comment and the browser would therefore ignore it. Are you placing `<!-- -->` around your script? – kindall Aug 21 '13 at 14:57
  • ahhh. This appears to work. I found a couple of people who advise against it (http://stackoverflow.com/questions/808816/are-html-comments-inside-script-tags-a-best-practice) so I tried to break it by including various strings that contain "-->" without much luck. i.e., I could not break it :) Maybe you should create an answer so I can accept it – mathieu Aug 21 '13 at 15:08

1 Answer


The best way I've found is to just do a replacement on the resulting string.

import json

obj = {"html": "<p>some text</p>"}  # any JSON-serialisable value
out = json.dumps(obj)
# "\/" is a legal JSON escape for "/", so the result is still valid JSON
out = out.replace("/", "\\/")

Escaping forward slashes is optional within the JSON spec, and doing so ensures that you won't get bitten by "</script>" attacks in the string.

user85461
  • I've created a library to protect against this and attacks from [weird unicode characters](http://timelessrepo.com/json-isnt-a-javascript-subset/): https://github.com/yourcelf/escapejson – user85461 Sep 09 '16 at 17:37
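The replace-after-dumps approach from the answer above, carried a little further, as an added example that is not part of the original answer or of the escapejson library. The function names `dumps_for_script` and `script_tag` and the sample data are assumptions made for this example. Besides "/", it also rewrites "<", ">", "&" and the U+2028/U+2029 line terminators (which are legal inside JSON strings but were not legal inside JavaScript string literals before ES2019) as \uXXXX escapes. Every substitution is a legal JSON escape, so the output still round-trips through json.loads (and JSON.parse), and it can no longer contain the bytes "</script>" regardless of what the strings hold.

```python
import json

# Replacements applied to the serialised JSON text. Each key can only occur
# inside a JSON string (structural characters are {}[],: and whitespace,
# and numbers never contain them), so a plain text replace is safe, and each
# value is a valid JSON string escape, so the output stays parseable.
_SCRIPT_SAFE = {
    "/": "\\/",          # blocks "</script>", as in the answer above
    "<": "\\u003c",      # blocks "<script>", "<!--" and friends
    ">": "\\u003e",      # blocks "-->" closing an HTML comment
    "&": "\\u0026",      # harmless in <script>, matters in XHTML/attributes
    "\u2028": "\\u2028", # JS line terminators that JSON allows unescaped
    "\u2029": "\\u2029",
}


def dumps_for_script(obj):
    """Return JSON text for obj that cannot terminate a <script> element.

    ensure_ascii=False is used deliberately so that U+2028/U+2029 reach the
    replacement table as raw characters; with the default ensure_ascii=True
    json.dumps already emits them as \\u2028 / \\u2029.
    """
    out = json.dumps(obj, ensure_ascii=False)
    for char, escaped in _SCRIPT_SAFE.items():
        out = out.replace(char, escaped)
    return out


def script_tag(var_name, obj):
    """Embed obj in a <script> block as a JS variable assignment."""
    return "<script>var %s = %s;</script>" % (var_name, dumps_for_script(obj))


# Constructed sample data (assumption for this example): strings that carry
# markup, a string that tries to break out of the script block, and a string
# containing a U+2028 line separator.
SAMPLE = {
    "html": "<p>some text</p>",
    "evil": "</script><script>alert(1)</script>",
    "note": "line one\u2028line two",
}


def unsafe_has_breakout(obj=SAMPLE):
    """True if plain json.dumps output would close the enclosing <script>."""
    return "</script>" in json.dumps(obj)


def safe_has_breakout(obj=SAMPLE):
    """True if the escaped output could still close the enclosing <script>."""
    return "</script>" in dumps_for_script(obj)


def round_trips(obj=SAMPLE):
    """True if the escaped text still decodes to the original data."""
    return json.loads(dumps_for_script(obj)) == obj


if __name__ == "__main__":
    print(script_tag("tmp", SAMPLE))
```

Note that this only makes the JSON safe as the body of a `<script>` element (or a `var x = ...;` assignment inside one). If the same text is placed in an HTML attribute or inline event handler, it still needs attribute escaping on top.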