2

The jQuery form .serialize() method serializes form contents to a string and automatically URL-encodes the string. My server then reverses this process and URL-decodes the string when deserializing it.

But what I need to be able to do is to HTML-encode the form contents before the form is serialized. In other words, if a user enters HTML into a text input in my form, I want this to be made safe using HTML encoding, then transmitted exactly as described above (using URL-encoding as normal).

Let me illustrate with an example:

Current implementation using .serialize()

  1. User enters My name is <b>Fred</b> into a form input with name Details.
  2. .serialize() serializes this as Details=My+name+is+%3Cb%3EFred%3C%2Fb%3E (URL-encoding)
  3. The server deserializes the string and gets My name is <b>Fred</b> (URL-decoding)

What I want to happen

  1. User enters My name is <b>Fred</b> into a form input with name Details.
  2. This gets HTML-encoded to My name is &lt;b&gt;Fred&lt;/b&gt; (HTML-encoding)
  3. .serialize() serializes this as Details=My+name+is+%26lt%3Bb%26gt%3BFred%26lt%3B%2Fb%26gt%3B (URL-encoding)
  4. The server URL-decodes the string and gets My name is &lt;b&gt;Fred&lt;/b&gt; (URL-decoding only)
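
The desired flow above can be sketched in plain JavaScript, without jQuery. This is only an illustration: `escapeHtml` and `htmlThenUrlEncode` are hypothetical helper names that do not come from jQuery, and the `%20` to `+` replacement is there only to match the encoded strings shown in the example.

```javascript
// Hypothetical helper: HTML-encode the five characters that matter in markup.
function escapeHtml(text) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, function (ch) { return map[ch]; });
}

// Hypothetical helper: HTML-encode the value first, then URL-encode the pair.
// Spaces are written as "+" to match the form-style output shown above.
function htmlThenUrlEncode(name, value) {
  var pair = encodeURIComponent(name) + '=' + encodeURIComponent(escapeHtml(value));
  return pair.replace(/%20/g, '+');
}

console.log(htmlThenUrlEncode('Details', 'My name is <b>Fred</b>'));
// Details=My+name+is+%26lt%3Bb%26gt%3BFred%26lt%3B%2Fb%26gt%3B
```

URL-decoding that value on the server gives back the HTML-encoded text, which is the result step 4 asks for.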

I was hoping that .serialize() might take an argument to specify that the form contents should be HTML-encoded, but no such luck. A couple of other possible solutions would be:

  1. Iterate through the form inputs and HTML-encode them "by hand" before calling .serialize(): I'd rather not have to do this as it will make the code messier and less robust.
  2. Modify my server to accept non-HTML-encoded values: for various reasons I won't go into, this is problematic and not a practical solution.

Is there a simpler solution?

Mark Whitaker
  • Just to clarify, you can't encode on the server side? Someone could disable JS and submit the form (if it works) – uv_man Apr 16 '15 at 08:34
  • You can use `serializeArray` instead and process the resulting array. (I'm not sure, but I don't think serializeArray encodes the values) – Felix Kling Apr 16 '15 at 08:34
  • This usually requires a server-side solution. ASP.Net/MVC does this automatically (refuses unsafe data unless specifically allowed). Are you PHP or .Net based? – iCollect.it Ltd Apr 16 '15 at 08:36
  • @uv_man No, see my comment 2 above. The site is inoperative without JavaScript, so the scenario you describe (while usually valid) isn't a concern here. – Mark Whitaker Apr 16 '15 at 08:38
  • @TrueBlueAussie I'm using MVC and you're right, it is rejecting un-escaped HTML during deserialization. That's the problem! – Mark Whitaker Apr 16 '15 at 08:39
  • @TrueBlueAussie Unfortunately not helpful in this case - see below. – Mark Whitaker Apr 16 '15 at 08:56

5 Answers

2

The solution is to use jQuery's .serializeArray() and apply the HTML-encoding to each element in a loop.

In other words, I had to change this:

$.ajax({
    url: form.attr('action'),
    async: false,
    type: 'POST',
    data: form.serialize(),
    success: function (data) {
        //...
    }
});

to this:

// HTML-encode form values before submitting
var data = {};
$.each(form.serializeArray(), function() {
    data[this.name] = this.value
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
});

$.ajax({
    url: form.attr('action'),
    async: false,
    type: 'POST',
    data: data,
    success: function (data) {
        //...
    }
});
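
A possible variant, not part of the code above: building a plain object keyed by field name keeps only the last value when several fields share a name (for example a checkbox group). A minimal sketch that keeps every entry is to map the `.serializeArray()` result to a new array of the same shape, which `$.ajax` also accepts as `data`. The names `htmlEncode` and `htmlEncodeFields` are made up for this illustration.

```javascript
// Hypothetical helper: same replacements as above, done in a single pass.
function htmlEncode(text) {
  var map = { '&': '&amp;', '"': '&quot;', "'": '&#39;', '<': '&lt;', '>': '&gt;' };
  return String(text).replace(/[&"'<>]/g, function (ch) { return map[ch]; });
}

// fields has the shape returned by .serializeArray(): [{ name: ..., value: ... }, ...]
// Returns a new array of the same shape with every value HTML-encoded.
function htmlEncodeFields(fields) {
  return fields.map(function (field) {
    return { name: field.name, value: htmlEncode(field.value) };
  });
}

// Assumed usage with jQuery (not run here):
//   data: htmlEncodeFields(form.serializeArray())
```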
Mark Whitaker
  • For others stumbling on this, always be sure to include server validation for XSS as well. Whatever can be done in JavaScript can always be undone in JavaScript. – Jason W Aug 23 '17 at 13:32
1

As you are using MVC (see comments), simply apply the [AllowHtml] attribute above the single property that requires it.

You will need to add the following using statement if not already present:

using System.Web.Mvc;

Note: If you are also using a MetadataTypeAttribute, it may not work out of the box (but this is unlikely to be a problem in this case).

Update

From the comments: as you cannot modify the form data properties (dynamic forms), you can turn request validation off for a single action by applying the following attribute to the controller action:

[ValidateInput(false)] 

You can also change the setting for the entire server (less secure). See this blog entry:

http://weblogs.asp.net/imranbaloch/handling-validateinputattribute-globally

iCollect.it Ltd
  • See my comment in the original question: "Modify my server to accept non-HTML-encoded values: for various reasons I won't go into this is problematic and not a practical solution." The form is dynamically generated and processed, so there's no property to add the attribute to. – Mark Whitaker Apr 16 '15 at 08:53
  • @Mark Whitaker: The only server-side alternative is to turn it off for the controller, or for the server. Are either of these an option? – iCollect.it Ltd Apr 16 '15 at 08:55
  • Not really. That's why I'd prefer a client-side solution. – Mark Whitaker Apr 16 '15 at 09:04
  • Thanks anyway though: your answer will surely be very helpful to other users who have the same problem but a more flexible server setup. – Mark Whitaker Apr 16 '15 at 09:04
0

Input values will always get encoded by default. As you stated, you have to iterate through each value and decode it first. You can use the following jQuery snippet to do that:

$('<div/>').html(value).text();
wintercounter
  • The `$('div').html(value).html()` trick isn't completely reliable: see [this answer](http://stackoverflow.com/a/7124052/440921) for a better solution. – Mark Whitaker Apr 16 '15 at 09:06
0

One option might be to modify the jQuery library directly and call htmlEncode on the DOM value before the URI encoding happens.

I tested this in an ASP.NET MVC app; the line I updated in jquery-1.8.2.js (line 7222, though this depends on the version) was:

        s[ s.length ] = encodeURIComponent( key ) + "=" + encodeURIComponent( value );

to

        s[ s.length ] = encodeURIComponent( key ) + "=" + encodeURIComponent( htmlEncode(value) );

Use whichever htmlEncode method you find suitable, but it appears to work.

It might actually make more sense to leave jQuery itself untouched and add a separate customSerialize method that does the htmlEncode.

I believe this is the simplest way, and it means you don't have to iterate through the DOM before calling serialize.
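
A rough sketch of what such a customSerialize could look like, written as a standalone function so the jQuery source stays unmodified. It takes an array in the `.serializeArray()` format plus whichever encoder you prefer. Both `customSerialize` and `basicHtmlEncode` are illustrative names here, not jQuery APIs.

```javascript
// Hypothetical encoder; swap in whichever htmlEncode implementation you prefer.
function basicHtmlEncode(text) {
  var map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, function (ch) { return map[ch]; });
}

// fields: [{ name: ..., value: ... }, ...] as produced by .serializeArray().
// Each value is HTML-encoded first, then each name and value is URI-encoded.
function customSerialize(fields, htmlEncode) {
  return fields.map(function (field) {
    return encodeURIComponent(field.name) + '=' +
           encodeURIComponent(htmlEncode(field.value));
  }).join('&');
}

// Assumed usage with jQuery (not run here):
//   data: customSerialize(form.serializeArray(), basicHtmlEncode)
```

Unlike the serialized strings shown in the question, this sketch leaves spaces as `%20` rather than `+`.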

uv_man
-1

A string should be HTML-encoded only after any other transformation, such as URL-encoding or SQL string escaping.
So first serialize the string and use it in links, then HTML-encode it after deserializing. Do everything as before, but use the function below.

Why is this important?

Because I could put a string that is not HTML-escaped into the URL myself and send it to you. You would assume it is escaped, but it would not be. The solution is to escape it just before printing it on the page.

This question describes how to html-escape a string: HtmlSpecialChars equivalent in Javascript?

function escapeHtml(text) {
  var map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#039;'
  };

  return text.replace(/[&<>"']/g, function(m) { return map[m]; });
}
Al.G.
  • Not true: it depends how you're using the string. Read through my example again and hopefully you'll see that in this case HTML encoding has to come first. (Because URL encoding is effectively being done transparently by the transport layer, and the result I want at the other end is an HTML-encoded version of the user's input.) – Mark Whitaker Apr 16 '15 at 08:55
  • @MarkWhitaker I'm sorry I didn't understand your question correctly but what will happen if you enter in the url a serialized but not html-encoded string? Will it be encoded? I don't think so (or I'm wrong?) – Al.G. Apr 16 '15 at 09:03