Sanitizing user input for display on a website: is escaping HTML entities enough for plain text, or do I still need DOMPurify?
I am building a website where the user can enter 2 types of text:
1. Text which is to be rendered as titles, display names, single links, etc.
2. Text which is rendered as markdown. This would be the body of an article, for example.
To sanitize these inputs and prevent XSS, I plan to use a different technique for each of the 2 types. Here are the functions I rely on:
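// Type 1: plain text (titles, display names, ...) is only entity-escaped.
function cleanedText(unclean) {
  return escapeHTML(unclean.trim());
}

// Type 2: markdown is parsed to HTML first, then the HTML is sanitized.
function cleanedMarkdown(unclean) {
  return DOMPurify.sanitize(marked.parse(unclean.trim()));
}

// Escape by letting the browser serialize a text node.
function escapeHTML(html) {
  var escape = document.createElement('textarea');
  escape.textContent = html;
  return escape.innerHTML;
}

// Reverse of escapeHTML(): parse the entities, read back the plain text.
function unescapeHTML(html) {
  var escape = document.createElement('textarea');
  escape.innerHTML = html;
  return escape.textContent;
}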
For 1 (text for titles, display names etc.), I use the cleanedText() function.
For 2 (markdown), I use the cleanedMarkdown() function.
As you can see, cleanedMarkdown() does more work: it parses the markdown first, then sanitizes the resulting HTML with DOMPurify. It does not escape the HTML entities.
cleanedText(), by contrast, only escapes the HTML. It does not run DOMPurify on the result because, as I understand it, once the entities are escaped they can only be rendered as text and cannot be interpreted as HTML. The HTML entity escaping code is from:
https://stackoverflow.com/a/9251169/1634905
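For reference, here is a minimal DOM-free sketch of the same kind of escaping, in case it helps to see exactly which characters are involved. It is only an illustration: the name escapeHTMLString and the choice to also escape quotes are my assumptions. The textarea-based escapeHTML() leaves quotes untouched, which only matters if the value ends up inside an HTML attribute rather than in element text.

```javascript
// Hypothetical DOM-free counterpart to escapeHTML(). The name and the
// decision to also escape quotes are assumptions, not part of the code above.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function escapeHTMLString(text) {
  // Replace every special character in a single pass so that the "&"
  // introduced by one replacement is never escaped a second time.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHTMLString(`<a href='javascript:alert(1)'>hi</a> & bye`));
// prints: &lt;a href=&#39;javascript:alert(1)&#39;&gt;hi&lt;/a&gt; &amp; bye
```

Because this version does not touch the DOM, it can also run outside the browser (for example in Node).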
The following:
let title = cleanedText(`123<a href='javascript:alert(1)'>I am a dolphin!</a>Billy <script>alert('Hello Bob!')</script> #hashtag this is awesome#sauce<SCRIPT SRC=http://xss.rocks/xss.js></SCRIPT>`);
console.log(title)
outputs:
123&lt;a href='javascript:alert(1)'&gt;I am a dolphin!&lt;/a&gt;Billy &lt;script&gt;alert('Hello Bob!')&lt;/script&gt; #hashtag this is awesome#sauce&lt;SCRIPT SRC=http://xss.rocks/xss.js&gt;&lt;/SCRIPT&gt;
which seems okay to me and also renders correctly as text on my website.
Is my understanding correct that type 1 does not need to go through DOMPurify?