Prevent XSS attacks and still use Html.Raw

Question

I have a CMS where I use CKEditor to enter data. If a user types <script>alert('This is a bad script, data');</script>, CKEditor does a fair job: it encodes the input correctly and passes &lt;script&gt;alert(&#39;This is a bad script, data&#39;)&lt;/script&gt; to the server.

But if a user opens the browser developer tools (using Inspect Element) and adds the script directly into the editor contents, as shown in the screenshot below, the trouble starts. After the content is retrieved from the database and displayed in the browser, the alert box appears.

[Screenshot: editing CKEditor contents through Inspect Element]

So far I have tried many different things. One of them is:

Encode the contents using AntiXssEncoder [HttpUtility.HtmlEncode(Contents)] and store them in the database; then, when displaying them in the browser, decode them and render with MvcHtmlString.Create [MvcHtmlString.Create(HttpUtility.HtmlDecode(Contents))] or Html.Raw [Html.Raw(Contents)]. As you may expect, both of them display the JavaScript alert.
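A minimal sketch of why the encode-then-decode round trip gives no protection: decoding simply restores the original markup, so whatever is rendered raw is the attacker's script again. This assumes System.Net.WebUtility as a stand-in for HttpUtility so that it runs without a reference to System.Web; the class name and input string are illustrative only.

```csharp
using System;
using System.Net;

public static class EncodeRoundTripDemo
{
    public static void Main()
    {
        // Hypothetical content posted from the editor.
        string contents = "<script>alert('This is a bad script, data');</script>";

        // Encoding before storage turns the markup into inert text.
        string stored = WebUtility.HtmlEncode(contents);

        // Decoding before display restores the original markup exactly.
        string displayed = WebUtility.HtmlDecode(stored);

        Console.WriteLine(stored.Contains("<script>")); // False: the stored form is harmless
        Console.WriteLine(displayed == contents);       // True: the decoded form is the script again
    }
}
```

Encoding is only a defence if the encoded form is what reaches the page. Once the value is decoded and emitted through Html.Raw, the browser sees the original script.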

I don't want to replace <script> manually in code, as that is not a comprehensive solution (search for "And the encoded state:").

So far I have read many articles (sorry for not listing them all here; I am adding a few as proof that I made a sincere effort before writing this question), but none of them has code that shows the answer. Maybe there is an easy answer and I am not looking in the right direction, or maybe it is not that simple at all and I need to use something like Content Security Policy.

ASP.Net MVC Html.Raw with AntiXSS protection
Is there a risk in using @Html.Raw?
http://blog.simontimms.com/2013/01/21/content-security-policy-for-asp-net-mvc/
http://blog.michaelckennedy.net/2012/10/15/understanding-text-encoding-in-asp-net-mvc/

To reproduce what I am describing, go to *this URL, type <script>alert('This is a bad script, data');</script> in the text box, and click the button.

*This link is from Michael Kennedy's blog

score 5 · Answer 1 · answered May 12 '17 at 15:23

I managed to resolve this issue using the HtmlSanitizer in NuGet:

https://github.com/mganss/HtmlSanitizer

as recommended by the OWASP Foundation (as good a recommendation as I need):

https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.236_-_Sanitize_HTML_Markup_with_a_Library_Designed_for_the_Job

First, add the NuGet Package:

> Install-Package HtmlSanitizer

Then I created an extension method to simplify things:

using Ganss.XSS;

...

public static string RemoveHtmlXss(this string htmlIn, string baseUrl = null)
{
    if (htmlIn == null) return null;
    var sanitizer = new HtmlSanitizer();
    return sanitizer.Sanitize(htmlIn, baseUrl);
}

I then validate within the controller when the HTML is posted:

var cleanHtml = model.DodgyHtml.RemoveHtmlXss();

And for completeness, sanitise whenever you present it to the page, especially when using Html.Raw():

<div>@Html.Raw(Model.NotSoSureHtml.RemoveHtmlXss())</div>

Michael Levy · Answer 2 · 2015-07-16T21:38:32.553

It isn't easy and you probably don't want to do this. May I suggest you use a simpler language than HTML for end-user formatted input? What about Markdown, which (I believe) is used by Stack Overflow? Or one of the existing wiki or other lightweight markup languages?

If you do allow HTML, I would suggest the following:

only support a fixed subset of HTML
after the user submits content, parse the HTML and filter it against a whitelist of allowed tags and attributes
be ruthless in filtering and eliminating anything that you aren't sure about
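
The steps above could be sketched with the HtmlSanitizer NuGet package mentioned in the other answer, tightening its defaults to a small fixed subset. This is only an illustration: the particular tags, attributes, and schemes chosen here are assumptions, as are the class name and sample input, and the namespace may be Ganss.Xss rather than Ganss.XSS depending on the package version.

```csharp
using System;
using Ganss.XSS; // newer HtmlSanitizer releases may use the namespace Ganss.Xss instead

public static class AllowlistDemo
{
    public static void Main()
    {
        var sanitizer = new HtmlSanitizer();

        // Start from an empty set and allow only a fixed subset of tags (illustrative choices).
        sanitizer.AllowedTags.Clear();
        foreach (var tag in new[] { "p", "b", "i", "ul", "ol", "li", "a" })
        {
            sanitizer.AllowedTags.Add(tag);
        }

        // Allow a single attribute; event handlers and style attributes are dropped.
        sanitizer.AllowedAttributes.Clear();
        sanitizer.AllowedAttributes.Add("href");

        // Only https links survive; javascript: URLs are not in the allowed schemes.
        sanitizer.AllowedSchemes.Clear();
        sanitizer.AllowedSchemes.Add("https");

        // Hypothetical untrusted input as it might arrive at the server.
        string dirty = "<p onclick=\"alert(1)\">Hi <b>there</b><script>alert('XSS')</script></p>";

        // Sanitize on the server before the content is saved.
        string clean = sanitizer.Sanitize(dirty);
        Console.WriteLine(clean);
    }
}
```

Starting from an empty allowlist rather than removing known-bad tags follows the "be ruthless" advice: anything not explicitly listed is filtered out.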

There are existing tools and libraries that do this. I haven't used it, but I did stumble on http://htmlpurifier.org/. I assume there are many others. Rick Strahl has posted one example for .NET, but I'm not sure if it is complete.

About ten years ago I attempted to write my own whitelist filter. It parsed and normalized the entered HTML, then removed any elements or attributes that were not on the whitelist. It worked pretty well, but you never know what vulnerabilities you've missed. That project is long dead, but if I had to do it over I would use an existing simpler markup language rather than HTML.

There are so many ways for users to inject nasty stuff into your pages that you have to be fierce to prevent it. Even CSS can be used to inject executable expressions into your page, like:

<STYLE type="text/css">BODY{background:url("javascript:alert('XSS')")}</STYLE>

Here is a page with a list of known attacks that will keep you up at night. If you can't filter and prevent all of these, you aren't ready for untrusted users to post formatted content viewable by the public.

Right around the time I was working on my own filter, MySpace (wow, I'm old) was hit by an XSS worm known as Samy. Samy used style attributes with an embedded background URL that carried a JavaScript payload. It is all explained by the author.

Note that your example page says:

This page is meant to accept and display raw HTML by trusted editors.

The key issue here is trust. If all of your users are trusted (say, employees of a web site), then the risk here is lower. However, if you are building a forum or social network or dating site or anything that allows untrusted users to enter formatted content that will be viewable by others, you have a difficult job sanitizing HTML.

+1 for the links. Unfortunately I don't have the liberty to replace CKEditor. Also, CKEditor is not the troublemaker: if you are typing inside it, I am not worried at all, because I have put strict restrictions in place using a combination of [extraAllowedContent](http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-extraAllowedContent) and [allowedContent](http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-allowedContent). The problem is when somebody goes into the browser tools [(as shown in the screenshot)](http://i.stack.imgur.com/1Cpa9.png) and then submits the contents. — ndd, Jul 17 '15 at 13:05
Yes, you must validate inputs on the server. You shouldn't just trust the UI. Someone may use dev tools (as you've said) or write scripts that simply post data to your controller. Just as you would validate other inputs, you must validate the HTML input. When you accept formatted HTML input from untrusted users, you must filter the submitted HTML. You can't just trust the editor. The filtering suggestions above are not a replacement for the editor; the suggestion is to filter the received data on the server before it is saved. — Michael Levy, Jul 17 '15 at 13:37
