I am pretty new to XSS and HTML Purifier (I have been researching them for a few days), although I have been programming and doing web development for many years. (Yes, I know it is a shame that I had not come across XSS before. I had thought about similar things, but never researched it in depth.)

AFAIK, attackers can load malicious external JS through places like an IMG's SRC, and through other valid tags' attributes as well. So I came up with an idea: if I prohibit users' HTML from loading resources outside my domain (and purify what I already have in my documents/database), can I say my site is free from XSS attacks?

Let me rephrase and structure my questions.

First, I am going to build a website that allows users to input HTML (directly or through upload). Quite typical.

I will use HTML Purifier to clean the user-supplied HTML.

The first question: (Q1) Even after using HTML Purifier, can attackers still load malicious scripts via valid HTML attributes?

(Q2) I suppose I cannot allow the <script> tag in the HTML Purifier settings, since arbitrary malicious things can happen in the JS inside a <script> tag. Is that true?

(Q3) Can HTML Purifier strip out all links, anywhere in the text, that do not refer to domains I trust?

And finally, a theoretical question: (Q4) If the text has been run through HTML Purifier and contains no external links, can we say that it is absolutely free from XSS?

P.S. One more thing: I would like to allow certain (very limited) JS. Do you think it is OK to convert custom tags of mine, like [ajax:getUserName], into real JS in the final processing step?

Thanks very much!

midnite

1 Answer

Let's assume for a moment that HTML Purifier has no security vulnerabilities (in general, it's a bad bet to assume software is bug-free, so beware).

Q1: If you use HTML Purifier as described by the documentation (use it to purify HTML, put the result of HTML Purifier only in HTML contexts, configure your character encoding properly), then attackers should not be able to load their scripts. It is "safe" out of the box.

Q2: HTML Purifier will not let you allow the <script> tag; it will reject it as unknown.

Q3: Unfortunately, HTML Purifier currently has direct support only for blacklisting strings in host names (using %URI.HostBlacklist) and for allowing only local links (%URI.DisableExternal). But you could define a URI filter for a more complicated policy.
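To make the "more complicated policy" concrete, here is a minimal, language-neutral sketch in Python of the decision such a filter might make: keep relative (local) URLs and URLs whose host is on a trusted list, and reject everything else. This is not HTML Purifier's API; `TRUSTED_HOSTS`, `ALLOWED_SCHEMES`, and `is_trusted_url` are hypothetical names chosen for illustration, and a real filter would need to be written against the library's own URI filter interface.

```python
from urllib.parse import urlsplit

# Hypothetical policy values, for illustration only.
TRUSTED_HOSTS = {"example.com", "static.example.com"}
ALLOWED_SCHEMES = {"http", "https"}


def is_trusted_url(url, trusted_hosts=TRUSTED_HOSTS):
    """Return True for relative URLs and for URLs on a trusted host."""
    url = url.strip()
    if "\\" in url:
        # Browsers may treat backslashes as slashes; reject to stay conservative.
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False

    host = parts.hostname  # lowercased by urlsplit, None if absent
    if parts.scheme:
        # Absolute URL: require an allowed scheme and a trusted host.
        if parts.scheme not in ALLOWED_SCHEMES:
            return False
        return host is not None and host in trusted_hosts

    # No scheme: either a relative path (no host) or a
    # protocol-relative URL such as //other.host/x (has a host).
    return host is None or host in trusted_hosts
```

The comparison is an exact host match on purpose: substring or suffix checks would let something like `example.com.evil.test` through.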

Q4: The no-external-links restriction is not necessary; the purified output should be free of XSS.

PS: That is OK, as long as you properly escape any user input that is included in the JS.
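As a rough illustration of what "properly escape" could look like, the Python sketch below expands `[ajax:...]` tags only for names found in a fixed table, and embeds any user-supplied value as an escaped JavaScript string literal. Everything here is an assumption made for the example: `ALLOWED_ACTIONS`, the `fetchUserName` handler, `js_string_literal`, and `expand_ajax_tags` are hypothetical names, and the expansion is assumed to run after purification.

```python
import json
import re

# Hypothetical mapping from custom tag names to JS functions you wrote yourself.
ALLOWED_ACTIONS = {"getUserName": "fetchUserName"}

TAG_RE = re.compile(r"\[ajax:([A-Za-z]+)\]")


def js_string_literal(value):
    """Encode value as a JS string literal that is safe inside <script>."""
    literal = json.dumps(value)  # quotes, backslashes, and non-ASCII are escaped
    # Also escape characters that could end the script block or start markup.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


def expand_ajax_tags(purified_html, user_value):
    """Replace known [ajax:name] tags with a script call; leave others as text."""
    def replace(match):
        handler = ALLOWED_ACTIONS.get(match.group(1))
        if handler is None:
            return match.group(0)  # unknown tag stays as inert text
        return "<script>%s(%s);</script>" % (handler, js_string_literal(user_value))

    return TAG_RE.sub(replace, purified_html)
```

The key design choice is that the user never supplies JavaScript: the tag only selects one of your own functions, and user data appears only as an escaped string argument.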

Edward Z. Yang
  • +1, thanks very much for your answer. Re (Q1): I read some posts saying that if an attribute links to an external malicious resource, HTML Purifier cannot purify the contents of that resource. This makes sense to me, and it also worries me. Could you explain a bit further how HTML Purifier handles this situation? For (Q3), is `%URI.DisableExternal` what I am looking for? Many thanks! – midnite Dec 04 '13 at 17:54
  • The post I mentioned is **MGH**'s post of October 06, 2013 04:17AM at http://htmlpurifier.org/phorum/read.php?3,6194 and I believe I have also read some Stack Overflow questions about something similar. – midnite Dec 05 '13 at 02:37
  • That is a different concern, which is related to the fact that HTML Purifier allows HTML which can cause the browser to make additional GET requests (e.g. images). You can disable this by setting %URI.DisableResources (but that will disable images) – Edward Z. Yang Dec 05 '13 at 02:51
  • Yes! This is my concern: additional GET requests, not only from img's src but possibly from other attributes too. `%URI.DisableResources` would be too restrictive. Do you think this would be a big security hole? Do you think we should use `%URI.DisableExternal` (and route links through a script on my domain that checks whether the URL is in a trusted domain and then redirects, just as Google search results all go through `http://www.google.com.hk/url?...`)? – midnite Dec 05 '13 at 03:07
  • Here are the two Stack Overflow threads I had read (I just found them again): http://stackoverflow.com/questions/4546591/xss-attacks-and-style-attributes and http://stackoverflow.com/questions/6976053/xss-which-html-tags-and-attributes-can-trigger-javascript-events?rq=1 – midnite Dec 05 '13 at 03:23
  • If you only want to disable resources to external websites, you can do %URI.DisableExternalResources – Edward Z. Yang Dec 05 '13 at 04:57
  • Thanks Edward. The question is: do external resources cause harm, even with HTML Purifier? – midnite Dec 05 '13 at 05:13