Sample code:

<?php

$json = "['foo', 'bar']";

var_dump( json_decode($json) );

It works on my machine with PHP 5.5.3, but it fails on lower PHP versions.

I know it is invalid JSON, but my web service gives me JSON that mixes ' and " as string delimiters:

['foo', "bar", {'test': "crazy \"markup\""}]

How can I parse JSON data with apostrophes in PHP 5.3? Obviously, the original JSON I want to parse is more complex.

(I can't upgrade PHP on the production server, nor can I get proper JSON from the web service.)

Peter
  • IMHO, if PHP/5.5.3 parses invalid JSON it's probably a bug. – Álvaro González Dec 03 '13 at 10:21
  • It appears your code sample doesn't work for any version, http://3v4l.org/Hl99u – Anthony Sterling Dec 03 '13 at 10:22
  • What does the broken webservice return if a string contains `"` or `'`? Have you let the service know they're serving bad JSON? – Eric Dec 03 '13 at 10:23
  • @AnthonySterling it works for `PHP 5.5.3-1ubuntu2 (cli)` – Peter Dec 03 '13 at 10:24
  • @Eric sometimes it uses `'`, sometimes `"` – Peter Dec 03 '13 at 10:25
  • I've had similar issues in the past with providers that would send invalid XML. In the end, I had to use string manipulation functions to fix errors as I discovered them, before the actual parsing. – Álvaro González Dec 03 '13 at 10:25
  • That doesn't answer my question. When a json string _value_ contains either type of quote, are they escaped? – Eric Dec 03 '13 at 10:28
  • Unfortunately, invalid data is a big problem. My two cents: talk with the provider if you can. I also had XML troubles and had to do a lot of text parsing to fix their invalid formatting (for example, they used unescaped ampersands). – Ende Neu Dec 03 '13 at 10:30
  • @Eric: it's crazy mix like `['foo', {"bar": "hel'lo", "foo": 'ba"r'}]` – Peter Dec 03 '13 at 10:31
  • Is this a real example? How is this (`["foo", 'bar', {'test': "hello \" world"}]`) going to be a valid string within quotes? Can you echo this JSON string? – The Alpha Dec 06 '13 at 10:44
  • [Check this answer](http://stackoverflow.com/questions/1575198/invalid-json-parsing-using-php). – The Alpha Dec 06 '13 at 10:46
  • @SheikhHeera no it's not, sorry. just imagine mix quotes / apostrophes – Peter Dec 06 '13 at 10:48
  • As a bad hack, wouldn't it work if you first escape (replace) " with \" and then replace all ' by "? – Tobi Dec 06 '13 at 10:53
  • @Sheikh it doesn't work http://codepad.viper-7.com/GRgr2m – Peter Dec 06 '13 at 10:55
  • @Tobi unfortunately it's a mix of " and ' – Peter Dec 06 '13 at 10:56
  • The glyphs are colloquially known in both PHP and JavaScript as single and double quotes; calling them apostrophes and quotation marks is confusing: http://www.php.net/manual/en/language.types.string.php & https://developer.mozilla.org/en-US/docs/Web/CSS/string – Mark Fox Apr 21 '14 at 00:46

7 Answers


Here's an alternative solution to this problem:

function fixJSON($json) {
    $regex = <<<'REGEX'
~
    "[^"\\]*(?:\\.|[^"\\]*)*"
    (*SKIP)(*F)
  | '([^'\\]*(?:\\.|[^'\\]*)*)'
~x
REGEX;

    return preg_replace_callback($regex, function($matches) {
        return '"' . preg_replace('~\\\\.(*SKIP)(*F)|"~', '\\"', $matches[1]) . '"';
    }, $json);
}

This approach is more robust than h2ooooooo's function in two respects:

  • It preserves double quotes occurring in a single-quoted string by applying additional escaping to them. h2o's variant will replace them with single quotes instead, thus changing the value of the string.
  • It will properly handle escaped double quotes \", for which h2o's version seems to go into an infinite loop.

Test:

$brokenJSON = <<<'JSON'
['foo', {"bar": "hel'lo", "foo": 'ba"r ba\"z', "baz": "wor\"ld ' test"}]
JSON;

$fixedJSON = fixJSON($brokenJSON);
$decoded = json_decode($fixedJSON);

var_dump($fixedJSON);
print_r($decoded);

Output:

string(74) "["foo", {"bar": "hel'lo", "foo": "ba\"r ba\"z", "baz": "wor\"ld ' test"}]"
Array
(
    [0] => foo
    [1] => stdClass Object
        (
            [bar] => hel'lo
            [foo] => ba"r ba"z
            [baz] => wor"ld ' test
        )
)
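
To use the function in application code, it may help to wrap the repair and the decode step together and report failures explicitly. The following is only a sketch: fixJSON() is copied unchanged from above, while the wrapper name decodeLenientJSON() and the choice of exception are assumptions made for illustration. The input is the sample from the question.

```php
<?php
// fixJSON() as given in this answer, repeated so the example is self-contained.
function fixJSON($json) {
    $regex = <<<'REGEX'
~
    "[^"\\]*(?:\\.|[^"\\]*)*"
    (*SKIP)(*F)
  | '([^'\\]*(?:\\.|[^'\\]*)*)'
~x
REGEX;

    return preg_replace_callback($regex, function($matches) {
        return '"' . preg_replace('~\\\\.(*SKIP)(*F)|"~', '\\"', $matches[1]) . '"';
    }, $json);
}

// Hypothetical wrapper (name and error handling are illustrative assumptions).
function decodeLenientJSON($json, $assoc = false) {
    $decoded = json_decode(fixJSON($json), $assoc);

    // json_decode() returns null both for the JSON literal null and on failure,
    // so json_last_error() is needed to tell the two cases apart.
    if ($decoded === null && json_last_error() !== JSON_ERROR_NONE) {
        throw new InvalidArgumentException(
            'Could not decode JSON, error code ' . json_last_error()
        );
    }

    return $decoded;
}

$input = <<<'JSON'
['foo', "bar", {'test': "crazy \"markup\""}]
JSON;

$data = decodeLenientJSON($input, true);
print_r($data);
```

Checking json_last_error() keeps a remaining syntax problem from silently turning into a null value further down the line.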
NikiC

Here's a simple parser that'll fix your quotes for you. Whenever it encounters a ' or " quote outside of a string, it treats everything up to the next occurrence of that same quote character as the string content, replaces any double quotes inside that content with single quotes, and wraps the result in double quotes:

Example:

<?php
    function fixJSON($json) {
        $newJSON = '';

        $jsonLength = strlen($json);
        for ($i = 0; $i < $jsonLength; $i++) {
            if ($json[$i] == '"' || $json[$i] == "'") {
                $nextQuote = strpos($json, $json[$i], $i + 1);
                $quoteContent = substr($json, $i + 1, $nextQuote - $i - 1);
                $newJSON .= '"' . str_replace('"', "'", $quoteContent) . '"';
                $i = $nextQuote;
            } else {
                $newJSON .= $json[$i];
            }
        }

        return $newJSON;
    }

    $brokenJSON = "['foo', {\"bar\": \"hel'lo\", \"foo\": 'ba\"r'}]";
    $fixedJSON = fixJSON( $brokenJSON );

    var_dump($fixedJSON);

    print_r( json_decode( $fixedJSON ) );
?>

Output:

string(41) "["foo", {"bar": "hel'lo", "foo": "ba'r"}]"
Array
(
    [0] => foo
    [1] => stdClass Object
        (
            [bar] => hel'lo
            [foo] => ba'r
        )

)

h2ooooooo

NikiC's answer is already spot on. Your input seems to be manually generated, so it's entirely possible that within ' single-quoted strings you'll receive unescaped " double quotes. A regex assertion is therefore advisable instead of a plain search and replace.

But there are also a few userland JSON parsers which support a bit more Javascript expression syntax. It's probably best to speak of JSOL, JavaScript Object Literals, at this point.

PEAR's Services_JSON

Services_JSON can decode:

  • unquoted object keys
  • and strings enclosed in single quotes.

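No additional options are required; just call $data = (new Services_JSON)->decode($jsol); (calling a method directly on a new object needs PHP 5.4+, so on PHP 5.3 assign the new Services_JSON object to a variable first).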

up_json_decode() in upgradephp

This was actually meant as a fallback for early PHP versions without the JSON extension. It reimplements PHP's json_decode(). But there's also the upgrade.php.prefixed version, which you'd use here.
It introduces an additional flag, JSON_PARSE_JAVASCRIPT.

up_json_decode($jsol, false, 512, JSON_PARSE_JAVASCRIPT);

And I totally forgot about mentioning this in the docs, but it also supports single-quoted strings.
For instance:

{ num: 123, "key": "value", 'single': 'with \' and unquoted " dbls' } 

Will decode into:

stdClass Object
(
    [num] => 123
    [key] => value
    [single] => with ' and unquoted " dbls
)

Other options

  • JasonDecoder by @ArtisticPhoenix does support unquoted keys and literals, though no '-quoted strings. It's easy to understand or extend however.

  • YAML (1.2) is a superset of JSON, and most parsers support both unquoted keys and single-quoted strings. See also PHP YAML Parsers

Obviously any JSOL tokenizer/parser in userland is measurably slower than just preprocessing malformed JSON. If you expect no further gotchas from your webservice, go for the regex/quote conversion instead.

mario

One solution would be to build a proxy using Node.js. Node.js accepts the faulty data when it is evaluated as a JavaScript literal (JSON.parse itself would still reject it) and can return a clean version:

johan:~ # node
> JSON.stringify(['foo', 'bar']);
'["foo","bar"]'

Maybe write a simple Node script that accepts the JSON data on STDIN and writes the validated JSON to STDOUT. That way you can call it from PHP.

The downside is that your server would need NodeJS. Not sure if that is a problem for you.

Johan

If you know that PHP 5.5+ will parse this JSON gracefully, I would pipe the web service responses through a proxy script on a PHP 5.5+ web server that sanitizes the responses for lower versions, meaning just echo json_encode(json_decode($response)); That's a stable and reliable approach.

If you make the web service URL configurable through a config value, lower versions can access the proxy while higher versions access the web service directly.

hek2mgl

A fast solution could be str_replace("'", "\"", $string). It only works if no string value itself contains an apostrophe or an unescaped double quote, but you could give it a try.
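
Whether this shortcut is enough depends on the payload. Here is a minimal sketch of where it succeeds and where it breaks down, using inputs modelled on the samples in the question and its comments (the variable names are only illustrative):

```php
<?php
// Only single-quoted strings: swapping the quote characters yields valid JSON.
$simple = "['foo', 'bar']";
$fixedSimple = str_replace("'", '"', $simple);
var_dump(json_decode($fixedSimple) === array('foo', 'bar')); // bool(true)

// An apostrophe inside a double-quoted string turns into a stray double quote,
// so the value "hel'lo" becomes "hel"lo" and the document no longer parses.
$mixed = '[\'foo\', {"bar": "hel\'lo"}]';
$fixedMixed = str_replace("'", '"', $mixed);
var_dump(json_decode($fixedMixed)); // NULL
```

For feeds that mix both quote styles, as described in the question's comments, a quote-aware approach is needed instead.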

Adrian Mare

You could use (and probably modify/extend) a library to build an AST from the supplied JSON and replace the single quotes with double quotes.

https://github.com/Seldaek/jsonlint/blob/master/src/Seld/JsonLint/Lexer.php

Might be a good start.
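
To give a rough idea of the lexing approach without the library, here is a minimal hand-written scanner. It does not use or imitate JsonLint's API and builds no AST. The function name requoteJSON() is hypothetical, and the sketch assumes the only defects in the input are single-quoted strings and \' escapes.

```php
<?php
// Hypothetical helper: walks the input once and tracks whether it is inside a string.
function requoteJSON($json) {
    $out   = '';
    $len   = strlen($json);
    $quote = null; // delimiter of the string being scanned, or null outside strings

    for ($i = 0; $i < $len; $i++) {
        $ch = $json[$i];

        if ($quote === null) {
            if ($ch === '"' || $ch === "'") {
                $quote = $ch;  // a string starts: always emit a double quote
                $out  .= '"';
            } else {
                $out .= $ch;   // structural characters pass through unchanged
            }
        } elseif ($ch === '\\' && $i + 1 < $len) {
            $next = $json[++$i];
            // \' is not a valid JSON escape, so emit a bare apostrophe;
            // every other escape sequence is copied as-is.
            $out .= ($next === "'") ? "'" : '\\' . $next;
        } elseif ($ch === $quote) {
            $quote = null;     // the string ends
            $out  .= '"';
        } elseif ($ch === '"') {
            $out .= '\\"';     // bare double quote inside a single-quoted string
        } else {
            $out .= $ch;
        }
    }

    return $out;
}

$brokenJSON = <<<'JSON'
['foo', {"bar": "hel'lo", "foo": 'ba"r ba\"z', "baz": "wor\"ld ' test"}]
JSON;

$decoded = json_decode(requoteJSON($brokenJSON));
print_r($decoded);
```

Because the scanner keeps explicit state, further repairs (for example unquoted keys) would have to be added as extra branches; a real lexer such as the one linked above is the better base if the input has more kinds of defects.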

Anthony Sterling