
Is there a way in JavaScript or MooTools to retrieve the actual text in the value of an input element without the browser interpreting any HTML special entities? Please see the example below. My desired outcome is:

<div id="output">
   <p>Your text is: <b>[&lt;script&gt;alert('scrubbed');&lt;/script&gt;]</b></p>
</div>

Note that it works if I type/copy &lt;script&gt;alert('scrubbed');&lt;/script&gt; directly into the text input box, but fails if I insert right after loading the page.

<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>scrubtest</title>
</head>
<body id="scrubtest" onload="">
    <script type="text/javascript" language="JavaScript" src="/js/mootools-core.js"></script>

    <input type="text" name="scrubtext" value="&lt;script&gt;alert('scrubbed');&lt;/script&gt;" id="scrubtext"/><br />
    <input type="button" value="Insert" onclick="insertText();"/><br />

    <input type="button" value="Get via MooTools" onclick="alert($('scrubtext').get('value'));"/><br />
    <input type="button" value="Get via JavaScript" onclick="alert(document.getElementById('scrubtext').value);"/><br />

    <div id="output">
    </div>

    <script type="text/javascript" charset="utf-8">
        function insertText()
        {
            var stext = $('scrubtext').get('value');
            var result = new Element( 'p', {html: "Your text is: <b>["+stext+"]</b>"} );
            result.inject($('output'));
        }
    </script>

</body>
</html>

4 Answers

{html: "Your text is: <b>["+stext+"]</b>"}

That's your problem: you're taking a plain text string and adding it into HTML markup. Naturally any < characters in the text string will become markup, and you give yourself a potential client-side cross-site-scripting vulnerability.

You can HTML-escape the string, but there's no built-in function to do it in JS, so you have to define it yourself, e.g.:

// HTML-encode a string for use in text content or an attribute value delimited by
// double-quotes
//
function HTMLEncode(s) {
    return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;');
}

...

var result = new Element('p', {html: "Your text is: <b>["+HTMLEncode(stext)+"]</b>"});
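
As a slightly fuller sketch of the same idea, the helper below also escapes `>` and `'`, so the result is safe inside single-quoted attribute values too. The names `escapeHtml` and `HTML_ESCAPES` are my own for illustration, not part of MooTools or the language, and the sample string is assumed to be what the question's input reports as its value.

```javascript
// Hypothetical helper (not a MooTools or built-in function): escapes the five
// characters that are significant in HTML text and quoted attribute values.
var HTML_ESCAPES = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
};

function escapeHtml(s) {
    // Coerce to a string so numbers and similar values can be passed too.
    return String(s).replace(/[&<>"']/g, function (ch) {
        return HTML_ESCAPES[ch];
    });
}

// Assumed sample: the decoded value the browser hands back for the input.
var stext = "<script>alert('scrubbed');</script>";
var markup = "Your text is: <b>[" + escapeHtml(stext) + "]</b>";
console.log(markup);
```

The resulting `markup` string could then be passed as the `html` option in place of the unescaped concatenation.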

However, it is generally simpler to use DOM methods to add plain text without the bother of string hacking. I believe Moo would do it like this:

var bold= new Element('b', {text: stext});
var result= new Element('p', {text: 'Your text is: '});
bold.inject(result);
bobince

Escape each & as &amp;, like this:

<input type="text" name="scrubtext" value="&amp;lt;script&amp;gt;alert('scrubbed');&amp;lt;/script&amp;gt;" id="scrubtext"/>
David Murdoch
  • While that does work, it doesn't feel right to me to have to "sanitize" the text that's already been sanitized once. The value of the input is a result from server-side (php) validation: $destination['transit_time'] = trim( htmlspecialchars( stripslashes( $destination['transit_time'] ), ENT_QUOTES )); I just want it to stay sanitized as I move it around the document. – CaptainQwyx Apr 20 '10 at 21:15
  • You aren't sanitizing it. You are HTML escaping it. `&` must be escaped to be displayed in HTML (browsers are [sometimes] forgiving and display it even if you don't HTML encode it). It's a rule. Follow it. – David Murdoch Apr 20 '10 at 21:34
  • But that is exactly what you want. When you write (simplified) `<input value="&lt;">`, the value of the input **is** `<`. If you want the value to be `&lt;`, you have to escape it as `&amp;lt;`. That is simply how it works, and there is nothing wrong with it. – RoToRa Apr 20 '10 at 21:34
  • @RoToRa: Ok I guess that makes sense. I was thinking there was a difference between and and hoping there was some way for me to preserve that difference. But thank you both, and everyone else who answered. – CaptainQwyx Apr 20 '10 at 23:24
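
The single decoding pass discussed in these comments can be mimicked outside the browser. The sketch below is for illustration only: `encode` and `decodeOnce` are hypothetical helpers that handle just four entities, whereas a real HTML parser recognizes many more.

```javascript
// Hypothetical stand-in for server-side escaping such as htmlspecialchars.
function encode(s) {
    return s.replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;')
            .replace(/"/g, '&quot;');
}

// Hypothetical stand-in for the one decoding pass the browser applies when
// it parses an attribute value. Replaced text is not rescanned, so
// "&amp;lt;" becomes "&lt;" rather than "<".
function decodeOnce(s) {
    var table = { amp: '&', lt: '<', gt: '>', quot: '"' };
    return s.replace(/&(amp|lt|gt|quot);/g, function (match, name) {
        return table[name];
    });
}

// What is written inside value="..." in the page source.
var attributeSource = "&lt;script&gt;";

// What the input's value holds after parsing: the entities are gone.
var value = decodeOnce(attributeSource);

// Escaping the source once more makes the entities survive the parse.
var doubled = encode(attributeSource);
var valueFromDoubled = decodeOnce(doubled);

console.log(value);
console.log(doubled);
console.log(valueFromDoubled);
```

Under these assumptions, the escaped form only survives parsing if it is escaped a second time, which is why the `&amp;lt;` form is needed in the attribute.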

You can change the & characters into &amp; by using

var result = new Element( 'p', {html: "Your text is: <b>["+stext.replace(/&/g,'&amp;')+"]</b>"} );

Addition: I would go with bobince on the benefit of using the DOM node properties, instead of injecting arbitrary HTML.

MasterAM
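function htmlspecialchars_decode(text)
{
    // Parse the text as HTML in a detached element, then read it back as plain text.
    var stub_object = new Element('span', { 'html': text });
    var ret_val = stub_object.get('text');
    // No explicit cleanup: `delete` does not work on variables, and the
    // detached element is garbage-collected once it goes out of scope.
    return ret_val;
}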
  • `delete` isn't expected to work with variable names, but with properties of an object (http://stackoverflow.com/questions/7009115/behavior-of-delete-operator-in-javascript) – Anderson Pimentel Sep 06 '13 at 01:44