
On my website, users will be able to enter HTML tags in their content so that text can be bold or italic and can include links and images. I plan to use CKEditor or TinyMCE, which produce real HTML tags (not BBCode or wiki syntax). If I allow HTML, it will be interpreted when the text is displayed, and it may contain an attack such as injected JavaScript (XSS). How can I avoid this security issue? Do I have to list the HTML tags I want and delete all other tags and their content? Can I use `strip_tags` for this?

How is it done on Stack Overflow, for example?

Do you know of any PHP or jQuery plugins that can safely save and safely render a limited set of HTML tags?

Thanks in advance for your help.

  • I think you can configure which elements are allowed in the CKEditor. -- http://stackoverflow.com/questions/2912805/how-to-define-allowed-tags-in-ckeditor – Smamatti Sep 13 '12 at 12:46
  • The most simple fix would be to just run an `str_replace(' – user1477388 Sep 13 '12 at 12:48
  • Take a look at the HtmlSanitizer that is part of the [Microsoft Web Protection Library](http://wpl.codeplex.com). – Steven Sep 13 '12 at 12:57

1 Answer


You need to use both a server-side HTML sanitizer and a Content Security Policy that blocks inline scripts, `eval`, and remotely hosted scripts.
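As a rough illustration of the policy half, here is a minimal sketch in Python (not the PHP of the question) that builds such a header. The function name `build_csp_header` and the exact set of directives are assumptions chosen for the example; in particular, the `img-src` entry assumes users may embed images from any HTTPS host.

```python
def build_csp_header():
    """Return a (name, value) pair for a restrictive Content-Security-Policy.

    Assumption: scripts are only ever served from the site's own origin.
    Because neither 'unsafe-inline' nor 'unsafe-eval' is listed, browsers
    that enforce CSP refuse inline <script> blocks, inline event handlers
    such as onload="...", and eval().
    """
    directives = {
        "default-src": ["'self'"],
        "script-src": ["'self'"],        # no inline, no eval, no remote hosts
        "img-src": ["'self'", "https:"],  # assumption: remote HTTPS images allowed
        "object-src": ["'none'"],
        "base-uri": ["'self'"],
    }
    value = "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )
    return "Content-Security-Policy", value


if __name__ == "__main__":
    header_name, header_value = build_csp_header()
    print(f"{header_name}: {header_value}")
```

The pair would be sent as a response header on every page that displays user content. The policy is a second line of defence: it limits the damage if something slips past the sanitizer, but it does not replace sanitizing.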

Depending on which language you use on the server side, use a library such as HtmlSanitiser or Bleach (Python).
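To show what allow-list sanitizing means, here is a minimal sketch using only the Python standard library. It is not how HtmlSanitiser, Bleach, or HTML Purifier work internally, and the names `ALLOWED`, `AllowListSanitizer`, `is_safe_url`, and `sanitize` are made up for the example. It keeps a few formatting tags, drops every other tag and attribute, and escapes all text. It does not balance tags or cover every browser quirk, so a maintained library remains the safer choice in production.

```python
import re
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list: tag name -> attributes permitted on that tag.
ALLOWED = {
    "b": set(), "strong": set(), "i": set(), "em": set(),
    "a": {"href"}, "img": {"src", "alt"},
}
VOID_TAGS = {"img"}                # tags that never get a closing tag
DROP_CONTENT = {"script", "style"}  # tags whose text content is discarded too
SCHEME_RE = re.compile(r"^([^/?#:]*):")


def is_safe_url(value):
    """Accept relative URLs and http/https; reject javascript:, data:, etc."""
    # Browsers ignore whitespace and control characters inside a scheme,
    # so remove them before looking for one.
    compact = "".join(ch for ch in value if ch > " ").lower()
    match = SCHEME_RE.match(compact)
    return match is None or match.group(1) in {"http", "https"}


class AllowListSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue  # drops onclick, onload, style, ...
            if name in {"href", "src"} and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html):
    parser = AllowListSanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<a href="javascript:alert(1)" onclick="x()">x</a>'))
    print(sanitize("<b>bold</b> and <body onload=\"alert('foo')\">text</body>"))
```

The design choice that matters is the direction of the check: everything is rejected unless it appears in the allow-list, so a payload nobody anticipated is dropped by default.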

Using either client-side validation or naive filtering will not protect you at all:

  1. Client-side validation, as suggested by @Smamatti, will not help you if a user submits the form manually.
  2. Naive filtering such as `str_replace('<script>', '', $str);`, suggested by @user1477388, will not protect you when someone uploads `<script src="foo">` or `<<script>script>alert('foo');</script>` or `<body onload="alert('foo');"></body>`.
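
The second point can be checked directly. This is a minimal sketch in Python rather than PHP; `str.replace` stands in for `str_replace` (both make a single pass over the input), and `naive_filter` is a name invented for the example.

```python
def naive_filter(html):
    """Block-list approach: remove the literal string '<script>' once."""
    return html.replace("<script>", "")


payloads = [
    '<script src="foo">',                      # has an attribute, never matches
    "<<script>script>alert('foo');</script>",  # removal reassembles the tag
    "<body onload=\"alert('foo');\"></body>",  # no <script> tag at all
]

if __name__ == "__main__":
    for payload in payloads:
        print(repr(payload), "->", repr(naive_filter(payload)))
```

The first and third payloads pass through unchanged, and the second turns into a working `<script>` tag because of the removal.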
Thomas Grainger
  • I totally agree with Thomas. There are so many ways to encode malicious code that it is practically impossible to come up with a safe method based on blacklist filtering. For instance, take a look at this [XSS Cheat Sheet](http://ha.ckers.org/xss.html) to get an idea of the crazy ways there are to inject scripts. It's quite scary, actually. For instance, try detecting this piece of code as malicious: ``, or this one: ``. – Steven Sep 13 '12 at 13:25
  • Thanks, everybody, for your answers. In fact, CKEditor escapes HTML tags and special characters that are not allowed. When you select a tag, for example, it is sent as-is to your form, but if you type a `<` yourself, it is escaped. It seems those guys thought about this security issue. My problem now is that, to display the text correctly, I need to call `htmlspecialchars_decode`. So if somebody sends bad code straight to my PHP processing page, I need to handle it at that point too. Security problems are really a pain... – user1496486 Sep 13 '12 at 14:53
  • @user: As Thomas said in (1), CKEditor is a client-side tool. You can't trust any validation measure that happens on the client side. An attacker could alter or bypass CKEditor and submit dangerous HTML. You *must* clean HTML on the server side. If you are using PHP, the usual tool is [HTML Purifier](http://htmlpurifier.org/). FWIW, for just bold/italic/image/links I'd personally use a smaller, friendlier, less complex language than HTML. – bobince Sep 13 '12 at 23:23
  • @bobince Thanks for HTML Purifier. What would you use for only those tags? BBCode? If you know a good, easy-to-use tool, that would be nice. – user1496486 Sep 14 '12 at 09:53
  • @user1496486 Stick with HTML if you're using CKEditor; just make sure your client-side and server-side validation allow the same set of tags. – Thomas Grainger Sep 16 '12 at 17:24