I am pretty new to XSS and HTML Purifier (I have been researching them for a few days), although I have been doing programming and web development for many years. (Yes, I know it is a shame that I never came across XSS. I had thought about similar issues, but never researched them in depth.)
AFAIK, attackers can load their malicious external JS through places like an IMG's SRC, and through other valid tags' attributes as well. So I came up with an idea: if I prohibit users' HTML from loading resources outside my domain (and purify what I already have in my documents/database), can I say my site is free from XSS attacks?
Let me rephrase and structure my questions.
First, I am going to build a website that allows users to input HTML code (directly or through upload). Quite typical.
I will use HTML Purifier to 'clean' the users' code.
The first question (Q1): Even after using HTML Purifier, attackers can still load their malicious scripts via valid HTML attributes. Is this true?
And (Q2): I suppose I cannot allow the <script> tag in the HTML Purifier settings, since anything malicious can happen in the JS inside a <script> tag. Is this true?
(Q3) Can HTML Purifier strip out all links, anywhere in the text, that do not refer to domains I trust?
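To make Q3 concrete, here is a rough sketch of the rule I have in mind. It is written in Python purely for illustration and is not HTML Purifier's API; the host names and the function name `is_trusted_url` are my own placeholders. The assumption is that relative URLs stay on my site, and that anything with a non-HTTP scheme (such as `javascript:` or `data:`) or an unknown host gets rejected.

```python
from urllib.parse import urlparse

# Hypothetical allow-list: replace with the hosts you actually trust.
TRUSTED_HOSTS = {"example.com", "static.example.com"}
ALLOWED_SCHEMES = {"http", "https"}


def is_trusted_url(url: str) -> bool:
    """Return True if the URL is relative or points at a trusted host."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # Malformed URLs (e.g. broken IPv6 literals) are rejected outright.
        return False

    # No scheme and no host: a relative URL, which resolves to my own site.
    if not parsed.scheme and not parsed.netloc:
        return True

    # Reject javascript:, data:, and any other non-HTTP scheme.
    if parsed.scheme and parsed.scheme.lower() not in ALLOWED_SCHEMES:
        return False

    # Covers absolute URLs and protocol-relative ones like //host/path.
    host = (parsed.hostname or "").lower()
    return host in TRUSTED_HOSTS
```

Every URL-carrying attribute (href, src, and so on) would have to pass this check, and anything that fails would be stripped.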
And finally, a theoretical question (Q4): If the text has been run through HTML Purifier and contains no external links, can we say that it is absolutely free from XSS?
P.S. One more thing: I would like to allow certain (very limited) JS. Do you think it is OK to convert (my custom) tags like [ajax:getUserName] into real JS as the final step of processing?
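For the custom-tag idea, this is roughly what I am picturing, again sketched in Python only to show the logic. The names `ALLOWED_SNIPPETS`, `expand_ajax_tags`, and the JS call `myAjax.getUserName()` are all made up for the example. The key assumption is that the emitted JS comes only from a fixed table I control, so no user-supplied text is ever copied into a script, and the expansion runs after purification.

```python
import re

# Hypothetical table: tag name -> fixed JS snippet written by me, not the user.
ALLOWED_SNIPPETS = {
    "getUserName": "<script>myAjax.getUserName();</script>",
}

# Only plain alphanumeric names can match, so [ajax:alert(1)] never does.
TAG_RE = re.compile(r"\[ajax:([A-Za-z][A-Za-z0-9]*)\]")


def expand_ajax_tags(purified_html: str) -> str:
    """Replace known [ajax:name] tags with their fixed snippet."""
    def replace(match: re.Match) -> str:
        snippet = ALLOWED_SNIPPETS.get(match.group(1))
        # Unknown names are left untouched as inert text.
        return snippet if snippet is not None else match.group(0)

    return TAG_RE.sub(replace, purified_html)
```

The input here is meant to be the already-purified HTML, so the only script tags in the final output would be the ones from my own table.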
Thanks very much!