
Is there any reason why PHP's json_encode function does not escape all JSON control characters in a string?

For example let's take a string which spans two rows and has control characters (\r \n " / \) in it:

<?php
$s = <<<END
First row.
Second row w/ "double quotes" and backslash: \.
END;

$s = json_encode($s);
echo $s;
// Will output: "First row.\r\nSecond row w\/ \"double quotes\" and backslash: \\."
?>

Note that carriage return and newline chars are unescaped. Why?

I'm using jQuery as my JS library, and its $.getJSON() function will do fine when you fully, 100% trust incoming data. Otherwise I use JSON.org's library json2.js like everybody else. But if you try to parse that encoded string, it throws an error:

<script type="text/javascript">

JSON.parse(<?php echo $s ?>);  // Will throw SyntaxError 

</script>

And you can't get the data! If you remove or escape \r \n " and \ in that string, then JSON.parse() will not throw an error.

Is there any existing, good PHP function for escaping control characters? A simple str_replace with search and replace arrays will not work.

Gustav
  • Instead of `<?php echo $s ?>` learn to use `<?=$s?>` :) Just my tip. – Thinker Jun 26 '09 at 11:26
  • Thanks for the tip, but that shortcut echo syntax works only when short_open_tag is enabled and I have never used short opening tags because I prefer – Gustav Jun 26 '09 at 11:35
  • I've edited my answer - it'll work now, or double your money back! – Greg Jun 26 '09 at 12:17
  • @Thinker Also, if short open tags get turned off on the server then not only does your code break, but it gets output to the browser. – charliefortune Mar 28 '12 at 09:39
  • I assume you set up your servers as you want. Short tags make code cleaner. I have also hardly ever seen short tags not working. – Thinker Mar 29 '12 at 09:47
  • @BadHorsie Check yourself before you make a wrong comment. `<?=` is not considered bad practice and it is always enabled even with short tags off since PHP 5.4. Also please read http://www.php-fig.org/psr/psr-1/ about coding standards – Thinker Aug 28 '14 at 09:12
  • @BadHorsie just to round this off since I landed on this page, `<?=` is _not_ affected by and thus not technically a part of 'short open tag'. It is just a shorthand for `<?php echo`. Most people consider short_open_tags bad practice, myself included, but not `<?=` – mewm Jan 19 '17 at 19:49
  • @mewm You are correct and I made my comment years ago when things were different. I have been using short echo tags myself for some time since then. Deleted comments to avoid poor information to future visitors. – BadHorsie Jan 20 '17 at 07:52
  • I don't understand why you need to do JSON.parse on a JSON object. Remember that PHP's json_encode is not the same as JSON.stringify (the first returns a JavaScript JSON object, the second returns a JavaScript string). So all you need to do is `var a = <?php echo $s ?>;` (without quotes) and there you'll have your JSON object stored in a variable, with its newlines and control characters. – Gonzalingui Mar 21 '17 at 19:56

12 Answers

function escapeJsonString($value) {
    # list from www.json.org: (\b backspace, \f formfeed)
    # "\x08" is backspace and "\x0c" is formfeed, so they map to \b and \f in that order
    $escapers =     array("\\",     "/",   "\"",  "\n",  "\r",  "\t", "\x08", "\x0c");
    $replacements = array("\\\\", "\\/", "\\\"", "\\n", "\\r", "\\t",  "\\b",  "\\f");
    $result = str_replace($escapers, $replacements, $value);
    return $result;
}

I'm using the above function, which escapes the backslash (it must be first in the arrays) and should deal with form feeds and backspaces (written as hex escapes because PHP double-quoted strings have no \b escape, and \f was only added in PHP 5.2.5).
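To see why the order of the pairs matters, here is a rough JavaScript port of the same idea. It is only a sketch: the function below is a hypothetical translation for illustration, not part of PHP or of any library, and it applies the pairs one after another the way str_replace does with arrays.

```javascript
// Hypothetical JavaScript port of escapeJsonString(), for illustration only.
function escapeJsonString(value) {
  // Pairs are applied in order. The backslash pair must come first,
  // otherwise the backslashes added by later pairs would be doubled.
  const pairs = [
    ["\\", "\\\\"],
    ["/", "\\/"],
    ['"', '\\"'],
    ["\n", "\\n"],
    ["\r", "\\r"],
    ["\t", "\\t"],
    ["\b", "\\b"], // backspace, U+0008
    ["\f", "\\f"], // form feed, U+000C
  ];
  let result = value;
  for (const [raw, escaped] of pairs) {
    result = result.split(raw).join(escaped);
  }
  return result;
}

const original = 'First row.\r\nSecond row w/ "double quotes" and backslash: \\.';

// Wrap the escaped text in double quotes to get a JSON string value.
const asJson = '"' + escapeJsonString(original) + '"';

// Parsing the result should give back the original text.
console.log(JSON.parse(asJson) === original);
```

If the backslash pair were moved to the end of the list, every escape produced by the earlier pairs would be escaped a second time and the round trip would no longer match.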

TRiG
Peter Whitefield

D'oh - you need to double-encode: JSON.parse is expecting a string of course:

<script type="text/javascript">

JSON.parse(<?php echo json_encode($s) ?>);

</script>
Greg
  • Doesn't escaped newline character look like this: \\n ? Right now \r and \n are not escaped. – Gustav Jun 26 '09 at 11:18
  • No, escaped newlines are "\n". An unescaped newline would be, well, a new line. – Greg Jun 26 '09 at 11:23
  • "\\n" would be an escaped backslash followed by an "n" - just try alert('foo\nbar'); and alert('foo\\nbar'); in a regular bit of code. – Greg Jun 26 '09 at 11:25
  • Not correct. var d = JSON.parse('["a\\nb"]'); alert(d); will alert: a b. JSON.parse('["a\nb"]') will throw an error because the newline is not escaped. – Gustav Jun 26 '09 at 11:50
  • The last comment doesn't show that a and b are on different rows in the alert box. – Gustav Jun 26 '09 at 11:53
  • I've edited my answer. I'm used to outputting JSON as actual JSON not a string! – Greg Jun 26 '09 at 12:14
  • That double encoding does the trick! :D Although it's a bit weird to use a second, nested json_encode: json_encode(json_encode($s)). If anyone is interested, the outcome would be: "\"First row.\\r\\nSecond row w\\\/ \\\"double quotes\\\" and backslash: \\\\.\"" Not very "beautiful" but it works. I'll leave this question open for a while; maybe someone has a better solution. Thanks Greg for your help! – Gustav Jun 26 '09 at 12:40
  • It looks weird but it is correct: the first one says "convert this object to JSON", then the second one says "convert this JSON to a JavaScript string". – Greg Jun 26 '09 at 12:43
  • That will also add double quotes for you at the beginning and end! Thanx Greg! – Vladimir Vukanac Nov 19 '14 at 10:13
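The two passes discussed in these comments can be tried without PHP. The sketch below uses JSON.stringify as a stand-in for json_encode (an assumption for illustration; the two differ in details such as PHP escaping "/") and new Function to imitate the browser evaluating the generated script line.

```javascript
const s = 'First row.\r\nSecond row w/ "double quotes" and backslash: \\.';

// First pass: the value becomes JSON text (roughly what json_encode($s) gives).
const jsonText = JSON.stringify(s);

// Second pass: the JSON text becomes a quoted literal that can be pasted
// into script source (roughly json_encode(json_encode($s))).
const jsLiteral = JSON.stringify(jsonText);

// Imitate the page source JSON.parse(<literal>) being evaluated.
const viaDouble = new Function("return JSON.parse(" + jsLiteral + ");")();
console.log(viaDouble === s);

// With one pass only, the script engine consumes the escapes first, so
// JSON.parse receives text with a raw CR/LF and no surrounding quotes.
const viaSingle = new Function("return JSON.parse(" + jsonText + ");");
try {
  viaSingle();
} catch (err) {
  console.log(err.name);
}
```

The single pass is still fine when the output is assigned directly, as in var x = <single pass output>;, because it is then used as a literal and never handed to JSON.parse.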

I still haven't figured out any solution without str_replace..

Try this code.

$json_encoded_string = json_encode(...);
$json_encoded_string = str_replace("\r", '\r', $json_encoded_string);
$json_encoded_string = str_replace("\n", '\n', $json_encoded_string);

Hope that helps...

Curtis
sp2hari
// Backslash goes first so the backslashes added by later pairs are not escaped again.
$search = array("\\", "\n", "\r", "\t", "\x0c", "\x08", "/", '"');
$replace = array("\\\\", "\\n", "\\r", "\\t", "\\f", "\\b", "\\/", '\\"');
$encoded_string = str_replace($search, $replace, $json);

This is the correct way

JunioR

Converting to and from PHP should not be an issue. PHP's json_encode encodes properly, but reinterpreting its output inside JavaScript can cause issues. For example:

1) The original string is [string with nnn newline in it] (where nnn is an actual newline character).

2) json_encode converts this to [string with "\\n" newline in it] (the control character is converted to "\\n", i.e. a literal "\n").

3) However, when you print this into a string literal using PHP echo, "\\n" is interpreted as "\n", and that causes heartache, because JSON.parse will understand the printed "\n" as a newline, i.e. a control character (nnn).

To work around this:

A) First encode the JSON object in PHP using json_encode to get a string. Then run it through a filter that makes it safe to use inside HTML and JavaScript.

B) Use the JSON string coming from PHP as a "literal" and put it inside single quotes instead of double quotes.


<?php
       function form_safe_json($json) {
            $json = empty($json) ? '[]' : $json ;
            // Backslash first; "\x08" is backspace (PHP strings have no "\b" escape).
            $search = array('\\',"\n","\r","\f","\t","\x08","'") ;
            $replace = array('\\\\',"\\n", "\\r","\\f","\\t","\\b", "&#039;");
            $json = str_replace($search,$replace,$json);
            return $json;
        }


        $title = "Tiger's   /new \\found \/freedom " ;
        $description = <<<END
        Tiger was caged
        in a Zoo 
        And now he is in jungle
        with freedom
    END;

        $book = new \stdClass ;
        $book->title = $title ;
        $book->description = $description ;
        $strBook = json_encode($book);
        $strBook = form_safe_json($strBook);

        ?>


    <!DOCTYPE html>
    <html>

        <head>
            <title> title</title>

            <meta charset="utf-8">


            <script type="text/javascript" src="/3p/jquery/jquery-1.7.1.min.js"></script>


            <script type="text/javascript">
                $(document).ready(function(){
                    var strBookObj = '<?php echo $strBook; ?>' ;
                    try{
                        var bookObj = JSON.parse(strBookObj) ;
                        console.log(bookObj.title);
                        console.log(bookObj.description);
                        $("#title").html(bookObj.title);
                        $("#description").html(bookObj.description);
                    } catch(ex) {
                        console.log("Error parsing book object json");
                    }

                });
            </script>

        </head>

         <body>

             <h2> Json parsing test page </h2>
             <div id="title"> </div>
             <div id="description"> </div>
        </body>
    </html>

Put the string inside single quotes in JavaScript. Putting the JSON string inside double quotes would cause the parser to fail at attribute markers (something like { "id" : "value" } ). No other escaping should be required if you put the string in as a "literal" and let the JSON parser do the work.
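The filtering step can be sketched in JavaScript to check the round trip. This is an illustration under assumptions: toSingleQuotedLiteral is a hypothetical helper, and it escapes the apostrophe with a backslash rather than with the HTML entity used in form_safe_json above.

```javascript
// Hypothetical helper: make JSON text safe inside a single-quoted JS literal.
function toSingleQuotedLiteral(jsonText) {
  const body = jsonText
    .replace(/\\/g, "\\\\") // double JSON's own backslashes so they survive the literal
    .replace(/'/g, "\\'")   // keep an apostrophe from ending the literal early
    .replace(/\n/g, "\\n")  // raw line breaks are not allowed inside a quoted literal
    .replace(/\r/g, "\\r");
  return "'" + body + "'";
}

const book = {
  title: "Tiger's /new \\found freedom",
  description: "Tiger was caged\nin a Zoo",
};

// JSON.stringify stands in for PHP's json_encode here.
const literal = toSingleQuotedLiteral(JSON.stringify(book));

// Imitate the browser evaluating: JSON.parse('<filtered JSON>')
const parsed = new Function("return JSON.parse(" + literal + ");")();
console.log(parsed.title === book.title, parsed.description === book.description);
```

Doubling the backslashes is the important part: without it the JavaScript literal would turn the JSON escape for a newline into a raw line break before JSON.parse ever sees it.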

rjha94

I don't fully understand how var_export works, so I will update if I run into trouble, but this seems to be working for me:

<script>
    window.things = JSON.parse(<?php var_export(json_encode($s)); ?>);
</script>
colllin

Maybe I'm blind, but in your example they ARE escaped. What about

<script type="text/javascript">

JSON.parse("<?php echo $s ?>");  // Will throw SyntaxError 

</script>

(note different quotes)

nothrow
  • You can't use double quotes there, because after echoing the string it will be: JSON.parse(""... and so on. – Gustav Jun 26 '09 at 11:08

Just an addition to Greg's response: the output of json_encode() is already contained in double-quotes ("), so there is no need to surround them with quotes again:

<script type="text/javascript">
    JSON.parse(<?php echo $s ?>);
</script>
Community
Stefan Gehrig

Control characters have no special meaning in HTML except for new lines in textarea.value. json_encode in PHP > 5.2 will do it as you expect.

If you just want to show text you don't need JSON. JSON is for arrays and objects in JavaScript (and indexed and associative arrays in PHP).

If you need a line feed for the textarea tag:

$s=preg_replace('/\r */','',$s);
echo preg_replace('/ *\n */','&#13;',$s);
B.F.

This is what I use personally and it's never not worked. Had similar problems originally.

Source script (ajax) will take an array and json_encode it. Example:

$return['value'] = 'test';
$return['value2'] = 'derp';

echo json_encode($return);

My javascript will make an AJAX call and get the echoed "json_encode($return)" as its input, and in the script I'll use the following:

myVar = jQuery.parseJSON(msg.replace(/&quot;/ig,'"'));

with "msg" being the returned value. So, for you, something like...

var msg = '<?php echo $s ?>';
myVar = jQuery.parseJSON(msg.replace(/&quot;/ig,'"'));

...might work for you.


There are 2 solutions unless AJAX is used:

  1. Write the data into a hidden input like this and read it in JS:

    <input type="hidden" value="<?= htmlspecialchars(json_encode($data)) ?>"/>
    
  2. Use addslashes

    var json = '<?= addslashes(json_encode($data)) ?>';
    
Pang
user2434435

When using any form of Ajax, detailed documentation for the format of responses received from the CGI server seems to be lacking on the Web. Some Notes here and entries at stackoverflow.com point out that newlines in returned text or JSON data must be escaped to prevent infinite loops (hangs) in JSON conversion (possibly created by throwing an uncaught exception), whether done automatically by jQuery or manually using JavaScript system or library JSON parsing calls.

In each case where programmers post this problem, inadequate solutions are presented (most often replacing \n by \\n on the sending side) and the matter is dropped. Their inadequacy is revealed when passing string values that accidentally embed escape sequences, such as Windows pathnames. An example is "C:\Chris\Roberts.php", which contains the sequences \C and \R. Neither is a valid JSON escape, so JSON conversion of the string {"file":"C:\Chris\Roberts.php"} can fail or, as reported, loop forever. One way of generating such values is deliberately to attempt to pass PHP warning and error messages from server to client, a reasonable idea.
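The pathname case is easy to reproduce. The snippet below is a small sketch using only the built-in JSON object: it builds the JSON text by hand with raw backslashes, then lets an encoder build it instead.

```javascript
// The path as it exists in memory: one backslash before each name.
const path = "C:\\Chris\\Roberts.php";

// JSON text assembled by hand, with the backslashes left as they are.
const handBuilt = '{"file":"' + path + '"}';

let handBuiltError = null;
try {
  JSON.parse(handBuilt);
} catch (err) {
  handBuiltError = err; // the parser rejects the unknown escapes
}
console.log(handBuiltError instanceof SyntaxError);

// A JSON encoder doubles each backslash, so the round trip is lossless.
const encoded = JSON.stringify({ file: path });
console.log(encoded);
console.log(JSON.parse(encoded).file === path);
```

In a modern engine the hand-built text produces a SyntaxError rather than a hang, so an apparent hang is more likely an uncaught exception somewhere in the calling code.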

By definition, Ajax uses HTTP connections behind the scenes. Such connections pass data using GET and POST, both of which require encoding sent data to avoid incorrect syntax, including control characters.

This gives enough of a hint to construct what seems to be a solution (it needs more testing): use rawurlencode on the PHP (sending) side to encode the data, and decodeURIComponent on the JavaScript (receiving) side to decode it (rather than the deprecated unescape, which does not decode UTF-8 sequences). In some cases you will apply these to entire text strings; in other cases you will apply them only to values inside JSON.

If this idea turns out to be correct, simple examples can be constructed to help programmers at all levels solve this problem once and for all.
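A minimal sketch of the receiving side, assuming the server applied rawurlencode to the value. The encoded string is written out by hand here, and encodeURIComponent is used as a stand-in for the PHP side in the second half.

```javascript
// What rawurlencode() emits for C:\Chris\Roberts.php (":" -> %3A, "\" -> %5C).
const fromServer = "C%3A%5CChris%5CRoberts.php";

// Decode on the receiving side.
const decoded = decodeURIComponent(fromServer);
console.log(decoded === "C:\\Chris\\Roberts.php");

// The same idea applied to a single value inside a JSON document:
// percent-encode the value before it goes into the JSON, decode it after parsing.
const payload = JSON.stringify({ file: encodeURIComponent(decoded) });
const file = decodeURIComponent(JSON.parse(payload).file);
console.log(file === decoded);
```

Percent-encoding leaves no backslashes, quotes or control characters in the value, which is why it sidesteps the escaping problem, at the cost of a larger payload and an extra decoding step for every such value.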

David Spector