Assume some PHP code that echoes user input into an HTML document after sanitizing it, first with addslashes() and then with htmlspecialchars(). I have heard that this is an unsafe approach, but cannot figure out why.

Any suggestions as to what sort of formatting could be applied to a dangerous input, such as JavaScript in script tags, to bypass the security measures imposed by the two functions would be appreciated.

halfer
  • ... or use a library to sanitise your output, like http://htmlpurifier.org/ - much less of a headache than an RYO approach (especially when you're dealing with a language other than English, with the UTF-8 charset, and you want ß rather than `&szlig;`); it ensures valid markup as well. – CD001 Dec 02 '14 at 13:07

1 Answer

addslashes is irrelevant to XSS (and there is almost always something better in places where it is actually useful).
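
A quick way to see this, as a minimal sketch (the `$payload` variable is only an illustrative name, not part of the question's code): addslashes() backslash-escapes quotes, backslashes and NUL bytes, so HTML markup passes through it untouched.

```php
<?php
// Hypothetical attacker-supplied input; it contains no quotes or backslashes.
$payload = '<script>alert(1)</script>';

// addslashes() only prefixes ', ", \ and NUL with a backslash,
// so the angle brackets survive and the string comes back unchanged.
$slashed = addslashes($payload);
var_dump($slashed === $payload); // bool(true)

// htmlspecialchars() is the call that actually neutralises the tags.
echo htmlspecialchars($slashed), "\n";
// &lt;script&gt;alert(1)&lt;/script&gt;
```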

htmlspecialchars is not an unsafe approach. It is just insufficient by itself.

htmlspecialchars will protect you if you put the content as the body of a "safe" element.

It will protect you if you put the content as the value of a "safe" attribute if you also properly quote the value.
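
As a rough illustration of the quoted-attribute case (the `$name` variable and the `<input>` markup are assumptions made for this sketch), passing `ENT_QUOTES` escapes both quote characters, so the value cannot close the attribute it sits in:

```php
<?php
// Hypothetical input trying to break out of a double-quoted attribute.
$name = '" onmouseover="alert(1)';

// ENT_QUOTES escapes both " and ', so the value cannot terminate the
// attribute regardless of which quote character delimits it.
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo '<input type="text" value="' . $safe . '">', "\n";
// <input type="text" value="&quot; onmouseover=&quot;alert(1)">
```

The escaping only helps because the attribute value is quoted; an unquoted `value=...` would still end at the first space in the input.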

It won't protect you if you put it as the value of an "unsafe" attribute or element (where the content may be treated as JavaScript) such as <script>, onmouseover, href or style.


For example:

<!-- http://example.com/my.php?message=", steal_your_cookies(), " -->
<!-- URL not encoded for clarity. Imagine the definition of steal_your_cookies was there too -->

<button onclick='alert("<?php echo htmlspecialchars($_GET['message']); ?>")'>
   click me
</button>

will give you:

<button onclick='alert("&quot;, steal_your_cookies(), &quot;")'>
   click me
</button>

which means the same as:

<button onclick='alert("", steal_your_cookies(), "")'>
   click me
</button>

which will steal your cookies when you click the button.
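
One possible way to close this particular hole, offered as a hedged sketch and not the only fix: escape for each context in turn, first for JavaScript with `json_encode`, then for the HTML attribute with `htmlspecialchars`. The `$message` variable stands in for `$_GET['message']`, and `steal_your_cookies` is the same hypothetical function as in the example.

```php
<?php
// Same hypothetical payload as in the example (stands in for $_GET['message']).
$message = '", steal_your_cookies(), "';

// Step 1: escape for JavaScript. json_encode() emits a quoted JS string
// literal; the JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX escapes.
$js = json_encode($message, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);

// Step 2: escape for the HTML attribute that contains the JavaScript.
$attr = htmlspecialchars('alert(' . $js . ')', ENT_QUOTES, 'UTF-8');

echo "<button onclick='" . $attr . "'>click me</button>\n";
// <button onclick='alert(&quot;\u0022, steal_your_cookies(), \u0022&quot;)'>click me</button>
```

After the browser decodes the attribute, the handler is `alert("\u0022, steal_your_cookies(), \u0022")`, so the payload stays inside a single string literal and is never run as code.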

Quentin
  • I thought it was fine to insert variables inside "unsafe" elements and attributes if the value is escaped using htmlspecialchars AND quoted properly. The escaping removes the quote characters so you need to surround the content again. E.g. var a = ''; – Phil Dec 02 '14 at 13:33
  • @Phil_1984_ - No. It isn't remotely safe. Escaping doesn't remove quote characters. It escapes them so they get treated as content instead of HTML. That means that in an unsafe context (such as `onclick='alert("");'`), the user can type a `'` and have it treated as a JavaScript `'` instead of an HTML `'`. This lets them put whatever JS they like in there. That is not safe! – Quentin Dec 02 '14 at 13:47
  • @Phil_1984_ — Using `ENT_QUOTES` won't make a difference (for this example) since there are no `'` in the data being escaped. – Quentin Dec 02 '14 at 14:16
  • Great examples. Yes, you are totally correct of course. My first comment is wrong. Would I also need to use addslashes, since the value is inside JavaScript? I guess htmlspecialchars is for escaping for HTML and addslashes is for escaping for JavaScript? – Phil Dec 04 '14 at 00:25
  • `json_encode` is a far better choice for escaping data for JS. – Quentin Dec 04 '14 at 07:09