7

When I submit/POST data to the server, I need to HTML-encode the relevant characters, since disabling request validation by setting validateRequest = false is not good practice.

All solutions ultimately come down to replacing characters in a string.

This is what I've written:

function htmlEncode(str) {
    // Escape "&" first so the entities added below are not double-encoded.
    str = str.replace(/&/g, "&amp;");
    str = str.replace(/</g, "&lt;");
    str = str.replace(/>/g, "&gt;");
    str = str.replace(/ /g, "&nbsp;");
    return str;
}

But apparently regex could be replaced with something much faster (don't get me wrong - I love regex).

Also, working with indexes + sub-strings seems wasteful.

What is the fastest way of doing it?
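For reference, the index-and-substring approach mentioned above could look roughly like the sketch below. The name `htmlEncodeScan` and the set of escaped characters are assumptions for illustration, not taken from any particular solution. Whether it beats the regex version depends on the engine and the input, so it would need profiling.

```javascript
// Hypothetical helper: scans the string once and copies untouched runs with slice().
function htmlEncodeScan(str) {
    str = String(str);
    var out = '';
    var last = 0; // start of the run that has not been copied to `out` yet
    for (var i = 0; i < str.length; i++) {
        var rep;
        switch (str.charCodeAt(i)) {
            case 38: rep = '&amp;'; break;  // &
            case 60: rep = '&lt;'; break;   // <
            case 62: rep = '&gt;'; break;   // >
            case 34: rep = '&quot;'; break; // "
            case 39: rep = '&#39;'; break;  // '
            default: continue;              // ordinary character, keep scanning
        }
        // Copy the untouched run before this character, then the entity.
        out += str.slice(last, i) + rep;
        last = i + 1;
    }
    // If nothing needed escaping, return the input as-is; otherwise append the tail.
    return last === 0 ? str : out + str.slice(last);
}
```

Strings without any special characters are returned unchanged, without building a new string.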

vzwick
Royi Namir
  • 5
    *disabling input check by setting validateRequest = false - is not a good practice*: Hacking around a security filter that rejects data you want to accept is worse practice. Set up your security filters to accept the type of content you want to accept instead of accepting defaults designed to protect people who don't know what they are doing. – Quentin Sep 24 '12 at 09:19
  • 2
    http://stackoverflow.com/questions/1219860/javascript-jquery-html-encoding – Erich Kitzmueller Sep 24 '12 at 09:20
  • 1
    http://stackoverflow.com/questions/1219860/javascript-jquery-html-encoding (edit: @ammoQ - heh!) – vzwick Sep 24 '12 at 09:21
  • @Quentin it is not recommended: http://books.google.co.il/books?id=QJNoykS0Tv4C&pg=PT84&lpg=PT84&dq=%22Another+approach+you+could+take+is+to+disable+request%22&source=bl&ots=JNalmbHtnV&sig=TFAcCLdRkgzHJMWsE7fzd5pWCtA&hl=en&sa=X&ei=5CdgULq0FLPI0AXi2oGICw&redir_esc=y#v=onepage&q=%22Another%20approach%20you%20could%20take%20is%20to%20disable%20request%22&f=false (from an ASP.NET security book) – Royi Namir Sep 24 '12 at 09:29
  • @RoyiNamir — That says it is a bad idea to turn it off "site-wide" not "when you need it". First it tells you how to do it on a page-by-page basis, then it tells you it is a bad idea to do it site-wide, then it tells you how to do it site wide. – Quentin Sep 24 '12 at 09:31
  • @Quentin looking at Fiddler on SO and Facebook, they do HTML-encode it before submitting – Royi Namir Sep 24 '12 at 09:32
  • @gdoron yeah. That's what distinguishes a good programmer from an excellent programmer. – Royi Namir Sep 24 '12 at 09:33
  • @RoyiNamir — Just edited one of my answers (and added `<` and `>` characters). No sign of any HTML encoding in the submitted data. – Quentin Sep 24 '12 at 09:36
  • 1
    @RoyiNamir — Good/Great programmers don't micro-optimise until code profiling says they need to. They write code designed to maximise maintainability. – Quentin Sep 24 '12 at 09:37
  • @Quentin regex replace won't give the best performance. Which approach will? That's my question – Royi Namir Sep 24 '12 at 09:38
  • 1
    @RoyiNamir — Not HTML encoding on the client in the first place will give the best performance on the client. – Quentin Sep 24 '12 at 09:38

3 Answers

12
function htmlEncode(str) {
    return String(str)
            .replace(/&/g, '&amp;')
            .replace(/"/g, '&quot;')
            .replace(/'/g, '&#39;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
}

jsPerf tests show this method is fast, and possibly the fastest option in recent browser versions.
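A variation on the chained version above is a single `replace` call with a character class and a lookup table. This is only a sketch: the names `HTML_ESCAPES` and `htmlEncodeOnePass` are made up for illustration, and no claim is made that it is faster than the chained calls, so measure it in the browsers you care about.

```javascript
// Lookup table for the same five characters the chained version escapes.
var HTML_ESCAPES = {
    '&': '&amp;',
    '"': '&quot;',
    "'": '&#39;',
    '<': '&lt;',
    '>': '&gt;'
};

// Hypothetical one-pass variant: each special character is matched once
// and replaced through the table, so ordering of replacements is not an issue.
function htmlEncodeOnePass(str) {
    return String(str).replace(/[&"'<>]/g, function (ch) {
        return HTML_ESCAPES[ch];
    });
}
```

Because every character is visited once, there is no need to remember that `&` has to be escaped before the other entities are added.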

Another way is to let jQuery do it, like this:

function htmlEncode(value){
  return $('<div/>').text(value).html();
}

function htmlDecode(value){
  return $('<div/>').html(value).text();
}
Sender
  • This won't handle multiple spaces. – Royi Namir Sep 24 '12 at 09:28
  • This works for most scenarios, but this implementation of htmlDecode will eliminate any extra whitespace. So for some values of "input", input != htmlDecode(htmlEncode(input)). This was a problem for us in some scenarios. For example, if input = "\t Hi \n There", a roundtrip encode/decode will yield "Hi There". – Royi Namir Sep 24 '12 at 09:34
  • I think getting the text and calling `trim()` will help you, or maybe `.replace(/ /g, '&nbsp;')` – Sender Sep 24 '12 at 09:45
  • I had to replace ' with ' for apostrophes to avoid the .net "potentially dangerous" error. – AndyMcKenna Dec 28 '15 at 13:54
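
Regarding the whitespace round trip discussed in the comments above: a string-only decode that mirrors the chained `replace` encoder never touches whitespace, so the round trip is lossless for it. The sketch below assumes only the five entities that encoder produces; `htmlEncodeBasic`, `htmlDecodeBasic` and `HTML_UNESCAPES` are hypothetical names, and this is not a general HTML entity decoder.

```javascript
// Encoder equivalent to the chained replace() version from the answer.
function htmlEncodeBasic(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Reverse table for exactly the entities produced above (an assumption:
// any other entity is left untouched).
var HTML_UNESCAPES = {
    '&amp;': '&',
    '&quot;': '"',
    '&#39;': "'",
    '&lt;': '<',
    '&gt;': '>'
};

// Single pass, so "&amp;lt;" decodes to "&lt;" and is not decoded twice.
function htmlDecodeBasic(str) {
    return String(str).replace(/&(?:amp|quot|#39|lt|gt);/g, function (entity) {
        return HTML_UNESCAPES[entity];
    });
}
```

With these two functions, `htmlDecodeBasic(htmlEncodeBasic(input)) === input` holds for any string, including ones with tabs, newlines and repeated spaces.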
-1

If you are just encoding HTML entities, you can try:

function htmlEncode(str) {
    var d = document.createElement('b');
    d.innerText = str;
    return d.innerHTML;
}

This way is not the fastest. This test indicates that RegExp is faster: http://jsperf.com/encodehtml

However, the difference seems to be smaller the more HTML you consume.

The innerText method seems more reliable, as it uses the browser's native conversion tables for entities. With RegExp, there is always a chance that you missed something and, as some previous answers indicate, processing HTML with RegExp is not always optimal.
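
One way to reduce the risk of missing a character with the RegExp approach is a small self-check that feeds each special character through the encoder and reports the ones that come back unchanged. This is a sketch only: `findUnescaped`, `partialEncode` and the `SPECIAL_CHARS` list are hypothetical, and which characters count as "special" depends on where the output ends up (text content versus attribute values).

```javascript
// Characters assumed to need escaping when output may land in attribute values too.
var SPECIAL_CHARS = ['&', '<', '>', '"', "'"];

// Returns the characters that the given encoder leaves unescaped.
function findUnescaped(encoder, chars) {
    return chars.filter(function (ch) {
        return encoder(ch) === ch;
    });
}

// Full RegExp encoder covering all five characters.
function fullEncode(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Deliberately incomplete encoder, to show what the check catches.
function partialEncode(str) {
    return String(str)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}
```

Here `findUnescaped(fullEncode, SPECIAL_CHARS)` returns an empty array, while `findUnescaped(partialEncode, SPECIAL_CHARS)` returns the two quote characters.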

Community
David Hellsing
-1
function htmlEncode(value){
    if (value) {
        return jQuery('<div />').text(value).html();
    }
    return '';
}
 
function htmlDecode(value) {
    if (value) {
        return $('<div />').html(value).text();
    }
    return '';
}
Luís Mestre
GajendraSinghParihar
  • This works for most scenarios, but this implementation of htmlDecode will eliminate any extra whitespace. So for some values of "input", input != htmlDecode(htmlEncode(input)). This was a problem for us in some scenarios. For example, if input = "\t Hi \n There", a roundtrip encode/decode will yield "Hi There". – Royi Namir Sep 24 '12 at 09:34