I'm trying to figure out what the minimum amount of encoding would be to protect a site from XSS.
I know for sure I'll need to encode < (&lt;) and > (&gt;) inside of tags, and " (&quot;) and ' (&#39;) inside attributes.
Do I also need to encode & (&amp;)? I was having trouble with double encoding when the user was saving data (because &amp; would become &amp;amp;). Are there any security vulnerabilities or downsides that would arise if I didn't encode the ampersands? This would mean users could input any HTML entities they wanted.
By HTML entities I specifically mean ampersand-prefixed sequences that correspond to entities (like &copy; and &trade;).
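To make the two behaviours concrete, here is a rough sketch. I'm using Python purely for illustration. `html.escape` and `html.unescape` are from the standard library, while `escape_without_amp` is a made-up helper of mine that shows the "skip the ampersand" idea, not an established technique.

```python
import html


def escape_without_amp(text: str) -> str:
    """Hypothetical helper: encode < > " ' but leave & untouched."""
    return (
        text.replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


# Full encoding (including &) applied twice reproduces the
# double-encoding problem: &amp; turns into &amp;amp;.
user_input = "Tom & Jerry <b>"
once = html.escape(user_input)
twice = html.escape(once)

# Skipping & lets user-typed entities pass through unchanged,
# so the browser will decode them when rendering.
entity_input = "&copy; <script>"
partial = escape_without_amp(entity_input)

# html.unescape approximates what the browser would end up displaying.
rendered = html.unescape(partial)

print(once)
print(twice)
print(partial)
print(rendered)
```

In this sketch the angle brackets are still neutralised when & is skipped, but the stored text is ambiguous: there is no way to tell a literal "&copy;" the user typed from the © character.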
This question is language-agnostic (except for the HTML part, of course).
Edit: heh, Stack Overflow lets me keep my HTML-encoded entities :) That might be telling.