Updated bad terminology
I'm looking at JSoup and the OWASP Java HTML Sanitizer project. I'm only interested in such a tool for preventing XSS attacks by sanitizing user input passed to the API layer. The OWASP project says:
"Passing 95+% of AntiSamy's unit tests plus many more."
But it doesn't say where I can find these tests. What do they cover? More simply, I want to know why these tools default to a whitelist approach.
I'm sure there is a reason they chose whitelisting over blacklisting. I want to disallow only known XSS-unsafe tags like `script` and attributes such as `on*`. A blacklist approach does not even seem to be possible with these tools.
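To make the blacklist idea concrete, here is a minimal sketch of what I have in mind, written with plain `java.util.regex` rather than JSoup or the OWASP sanitizer. The class name `NaiveBlacklist` and the method `strip` are hypothetical names for illustration, and the two regexes are only my assumption of what "disallow `script` and `on*`" could look like; this is not how either library works.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only: a regex-based blacklist, not a real HTML parser.
class NaiveBlacklist {
    // Matches a whole <script ...>...</script> element, case-insensitively, across lines.
    private static final Pattern SCRIPT_ELEMENT = Pattern.compile(
            "<script\\b[^>]*>.*?</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches an on* attribute (onclick, onerror, ...) with a quoted or unquoted value.
    private static final Pattern EVENT_ATTRIBUTE = Pattern.compile(
            "\\s+on\\w+\\s*=\\s*(\"[^\"]*\"|'[^']*'|[^\\s>]+)",
            Pattern.CASE_INSENSITIVE);

    // Removes only the two blacklisted constructs and leaves everything else untouched.
    static String strip(String html) {
        String withoutScripts = SCRIPT_ELEMENT.matcher(html).replaceAll("");
        return EVENT_ATTRIBUTE.matcher(withoutScripts).replaceAll("");
    }
}
```

With this sketch, `strip("<p onclick=\"evil()\">hi</p><script>evil()</script>")` returns `<p>hi</p>`, while `<a href="javascript:evil()">click</a>` comes back unchanged, since neither rule says anything about URL schemes. That is one gap I can already see in my own list.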
I need to know the reasoning behind this, and I suspect it's in the tests. For example, why disallow `style` tags? Are they dangerous in terms of XSS, or are they disallowed for some other reason? (`style` can be XSS-unsafe, as mentioned in the comments: XSS attacks and style attributes)
I'm looking for similar XSS justifications for other tags. The unit tests themselves would be enough if somebody knows where to find them. Given enough unsafe tags, they should show me why a whitelist approach is necessary.