First things first. I would be severely remiss if I failed to point out that accepting raw HTML from your users is, generally speaking, not a good idea™.
Doing this incorrectly (and it is an extremely difficult task to do correctly) leaves your site, and your users, open to many vulnerabilities. You can view a partial list of them at https://html5sec.org/ (I say partial because they're only listing the "known" attack vectors). There are a lot of good answers to a seemingly unrelated, but definitely semi-related, question and I strongly recommend that you read them all.
"But @Pete!", I hear you cry, "My users are trustworthy. They won't try to click-jack my other users, or do anything else malicious or untoward!"
You may be suffering under the delusion that no one who uses your site will ever be malicious, or that everyone will even be using a browser to submit HTML to your site (so don't forget server-side validation and sanitization).
Then again, you may not be deluded and your userbase has a vested interest in only submitting safe HTML for your site. Maybe you've already considered, and implemented, bullet-proof client-side and server-side validation and sanitization routines. I don't know your exact circumstances and I won't pretend to (although I do know the probabilities involved here are not in your favor).
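If it turns out you don't actually need the submitted markup to render as markup, the least risky server-side option I know of is to escape it before output rather than trying to sanitize it. Here's a minimal sketch of that idea; `escapeHtml` is a hypothetical helper name of mine, not something from a library, and it only makes text safe to place in element content or quoted attribute values, not in scripts, styles, or URLs.

```javascript
// Hypothetical helper: turn user-supplied text into inert HTML text.
// The ampersand is replaced first so the other entities are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```

The result is displayed as literal text by the browser, so any tags or event-handler attributes the user typed never become part of the DOM.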
With all of the above in mind, if you still insist on allowing a user to write and submit raw HTML to your site, consider:
- using the documentation found at https://validator.w3.org/docs/api.html to fire off an AJAX request and validate the HTML being submitted;
- using a plugin/library for a Rich Text Editor that lets the user enter formatted text like they would in a word processor and gives you a resulting HTML string to send to your server;
- using a plugin/library for a Markdown parser (like the one you use here at SO).
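To make the first option a bit more concrete, here's a rough sketch of what that request could look like. I'm assuming the W3C "Nu" checker endpoint (`https://validator.w3.org/nu/?out=json`) and its JSON output with a `messages` array, so double-check both against the documentation linked above. The names `wrapFragment`, `extractErrors`, and `validateFragment` are made up for this example, and the optional `fetchImpl` parameter is only there so the function can be exercised without hitting the network.

```javascript
// Assumption: endpoint and JSON format of the W3C Nu checker; verify against its docs.
const VALIDATOR_URL = 'https://validator.w3.org/nu/?out=json';

// The checker validates whole documents, so wrap the user's fragment in a minimal page.
function wrapFragment(fragment) {
  return '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">' +
    '<title>fragment</title></head><body>' + fragment + '</body></html>';
}

// Keep only the messages the checker flags as errors.
function extractErrors(result) {
  return (result.messages || []).filter(function (message) {
    return message.type === 'error';
  });
}

// POST the wrapped fragment and resolve with the list of error messages.
async function validateFragment(fragment, fetchImpl) {
  const doFetch = fetchImpl || fetch; // falls back to the global fetch
  const response = await doFetch(VALIDATOR_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'text/html; charset=utf-8' },
    body: wrapFragment(fragment)
  });
  if (!response.ok) {
    throw new Error('Validator request failed: ' + response.status);
  }
  return extractErrors(await response.json());
}
```

You would call it as `validateFragment(textarea.value).then(function (errors) { ... })`. Keep in mind that valid HTML is not the same thing as safe HTML: a perfectly well-formed `<script>` element passes validation just fine.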
You could also just convert the user's HTML to a DOM element (allowing the browser to parse the HTML into an actual DOM element) and then grab the [parsed] HTML string back:
window.addEventListener('load', function () {
  var textarea = document.getElementById('unsafe-html');
  var button = document.getElementById('get-unsafe-html');

  var getUnsafeHtml = function getUnsafeHtml() {
    var div = document.createElement('div');
    div.innerHTML = textarea.value; // parses HTML to DOM elements
    return div.innerHTML; // gets it back in a string form.
  };

  button.addEventListener('click', function (e) {
    var unsafeHtml = getUnsafeHtml();
    console.log(unsafeHtml);
    e.preventDefault();
    return false;
  }, false);
}, false);
<textarea id="unsafe-html" rows="5">
<p>If you <strong>insist</strong>, <i>then</i> this technique can be used as well.</p>
</textarea>
<button id="get-unsafe-html">Get the HTML</button>
This will not ensure that the markup is the way the user intended it to be, but it will ensure you don't have unmatched tags (as they will be either auto-closed, or removed, depending on the browser).
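One caveat with the snippet above: as far as I know, assigning to `innerHTML` on a `div` created from the live `document` can still start loading images and fire inline handlers such as `onerror`, even though the `div` is never attached to the page. A variation that should avoid this is to do the same round trip inside an inert document from `document.implementation.createHTMLDocument`. This is only a sketch; `normalizeHtml` and its optional `inertDoc` parameter are names I've made up for the example, and it still does no sanitization whatsoever.

```javascript
// Hypothetical helper: let the browser's parser close/repair tags without
// touching the live page. Assumes a browser environment when no document is passed.
function normalizeHtml(html, inertDoc) {
  var doc = inertDoc || document.implementation.createHTMLDocument('');
  var div = doc.createElement('div');
  div.innerHTML = html; // parsed as a fragment inside the inert document
  return div.innerHTML; // serialized back to a string
}
```

In the click handler you would then log `normalizeHtml(textarea.value)` instead of calling `getUnsafeHtml()`. The string you get back is only tidier, not safer, so the server-side checks still apply.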