
Say I copy some "malicious" input, like a DOM node with an event handler or other JavaScript:

<img src="bunny.jpg" onload="alert('hi');">

If I copy this to my clipboard and paste it into a contenteditable div, the event handler is cleanly stripped out:

<img src="/Users/tjhance/Desktop/bunny.jpg">

I can now manipulate this DOM node to my heart's content. So far so good.

On the other hand, say I want to hook the browser's paste event and handle the paste in my own way. I can get the clipboard data easily:

<div contenteditable="true" id="myContentEditableDiv"></div>

<script>

$('#myContentEditableDiv').on('paste', function(event) {
    console.log(event);
    var pastedHtml = event.originalEvent.clipboardData.getData('text/html');
    console.log(pastedHtml);
});

</script>

When I do the paste, I get this HTML:

<meta charset='utf-8'><img src="/Users/tjhance/Desktop/bunny.jpg" onload="alert('hi');" style="color: rgb(0, 0, 0); font-family: Times; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: 1; word-spacing: 0px; -webkit-text-stroke-width: 0px;">

It is unsanitized and still has the event listener on it. I can't do anything with this string, as far as I know. I can't parse it into HTML using the browser, since then it would run the JavaScript, and that's a huge security vulnerability.

It's clear that the browser has some capability to cleanse HTML, since it does it when pasting. So if I want clean HTML, I could just wait for the event to go through and have the HTML added to the DOM. Of course, I wouldn't be posting here if I was OK with doing that...

So my question is: is there any way I can take potentially dirty HTML and get clean, safe DOM nodes to manipulate using the browser DOM API, without having the browser actually paste the HTML into the contenteditable div (which the user can see)? What are my options here?

tjhance

1 Answer


You could use this hacky technique from olden days before all browsers supported getting clipboard data, although it's not very good. The biggest drawback is that it's only good for pasting via the keyboard.

The other alternative is to sanitize the HTML string yourself. The options which occurred to me as a starting point are DOMParser and document.implementation.createHTMLDocument. I'm not sure how secure they are; a quick search found this:

https://security.stackexchange.com/questions/50970/is-it-safe-to-use-createhtmldocument-to-sanitize-html
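
To make the second option concrete, here is a rough sketch of the DOMParser route. It rests on an assumption worth checking on the browsers you target: that a document built by DOMParser is inert, so scripts and inline handlers do not run while the markup sits in it. The names scrubTree, sanitizePastedHtml and BLOCKED_TAGS are made up for this example, and the filter is deliberately small and not exhaustive, so treat it as a starting point rather than a vetted sanitizer.

```javascript
// Sketch only: a hand-rolled filter like this is NOT a complete sanitizer.
// Element names dropped outright (illustrative, non-exhaustive list).
var BLOCKED_TAGS = ['SCRIPT', 'IFRAME', 'OBJECT', 'EMBED', 'STYLE', 'LINK', 'META'];

// Remove blocked elements, on* attributes and javascript: URLs, in place.
// Returns the same root it was given.
function scrubTree(root) {
    // Copy into a real array so removals do not disturb the iteration.
    var elements = Array.prototype.slice.call(root.querySelectorAll('*'));
    elements.forEach(function (el) {
        if (BLOCKED_TAGS.indexOf(el.tagName.toUpperCase()) !== -1) {
            if (el.parentNode) {
                el.parentNode.removeChild(el);
            }
            return;
        }
        Array.prototype.slice.call(el.attributes).forEach(function (attr) {
            var name = attr.name.toLowerCase();
            // Ignore whitespace so " JavaScript:..." is still caught.
            var value = attr.value.replace(/\s+/g, '').toLowerCase();
            if (name.indexOf('on') === 0 || value.indexOf('javascript:') === 0) {
                el.removeAttribute(attr.name);
            }
        });
    });
    return root;
}

// Browser-only wrapper: parse into a detached document (assumed inert),
// scrub it, and hand back its <body> for further DOM manipulation.
function sanitizePastedHtml(html) {
    var doc = new DOMParser().parseFromString(html, 'text/html');
    return scrubTree(doc.body);
}
```

In the paste handler from the question, you would call event.preventDefault() and then pass pastedHtml to sanitizePastedHtml, moving the resulting child nodes wherever you need them. A blocklist like this is easy to get wrong, so an allowlist of tags and attributes is probably the safer design if this goes anywhere near production.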

Tim Down