
I'm trying to find a JSON string inside a string, with PHP.

So if a string is entirely JSON, PHP can parse it like this:

<?php
$json = '{"a":1,"b":2,"c":3,"d":4,"e":5}';

var_dump(json_decode($json));
var_dump(json_decode($json, true));

?>

But what if I have a string like this:

$str = 'I have a string that contains JSON like this : {"a":1,"b":2,"c":3,"d":4,"e":5} and then string continues';

How can I parse JSON from this?

Thanks !

Edit:

Thanks for all your answers. They really helped me. And I should add that in my case, the string will be in this form:

$str = 'some string and some more string [[delimiter]] json={"a":1,"b":2}';

and, I'm not the downvoter :)

jeff
  • Well, you could use Regex? – naththedeveloper Jul 12 '13 at 14:36
  • preg_match the [] or {} pair? – Waygood Jul 12 '13 at 14:36
  • You shouldn't want it. You are doing something wrong. Have your JSON already separated from whatever strings. – Your Common Sense Jul 12 '13 at 14:37
  • Why do you have such a string in the first place? What's the use-case? – Ja͢ck Jul 12 '13 at 14:44
  • Your only real chance is to write a custom JSON parser from scratch, which looks for the first valid token in the string, tries to decode as much as possible and quietly continues on syntax errors. Regexen aren't gonna cut it, you need a state machine here. You should really avoid going there unless you really can't help it. – deceze Jul 12 '13 at 14:55
  • I use wordpress as CMS, and I want to store some data about the post, in the content (but not serve it obviously). I know it is probably wrong but I don't want to use the meta table, i.e. I want to store all the data in one place. – jeff Jul 12 '13 at 14:55
  • Just use the effing meta table, that's what it's for! ;-) – deceze Jul 12 '13 at 14:57
  • @deceze :) you are probably right, but I really like to be able to migrate my site, and I want to see all the posts as objects, that encapsulate all the relevant data. – jeff Jul 12 '13 at 15:00
  • @deceze Oh c'mon, you know that PCRE can do state machines ;-) – Ja͢ck Jul 12 '13 at 15:13

3 Answers


You will want a serious regular expression for this, such as the one here, which I've made very slight changes to for matching as substrings:

$str = 'I have a string [123,456] that contains JSON like this : {"a":1,"b":2,"c":3,"d":4,"e":5} and then string continues';

$pcre_regex = '
  /
  (?(DEFINE)
     (?<number>   -? (?= [1-9]|0(?!\d) ) \d+ (\.\d+)? ([eE] [+-]? \d+)? )
     (?<boolean>   true | false | null )
     (?<string>    " ([^"\\\\]* | \\\\ ["\\\\bfnrt\/] | \\\\ u [0-9a-f]{4} )* " )
     (?<array>     \[  (?:  (?&json)  (?: , (?&json)  )*  )?  \s* \] )
     (?<pair>      \s* (?&string) \s* : (?&json)  )
     (?<object>    \{  (?:  (?&pair)  (?: , (?&pair)  )*  )?  \s* \} )
     (?<json>   \s* (?: (?&number) | (?&boolean) | (?&string) | (?&array) | (?&object) ) \s* )
  )
  (?&json)
  /six
';

if (preg_match_all($pcre_regex, $str, $matches)) {
    print_r($matches[0]);
}

Returns:

Array
(
    [0] =>  [123,456] 
    [1] =>  {"a":1,"b":2,"c":3,"d":4,"e":5} 
)

Update

You can add anchors in the expression to match, e.g.:

json=(?<expr>(?&json))\Z
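As a minimal sketch of how that anchored form might be used with the `json=` layout from the question's edit: the block below repeats the same grammar, replaces the final `(?&json)` with the anchored expression, and passes the named `expr` group to `json_decode`. The variable names `$anchored_regex` and `$data` are illustrative assumptions, not part of the original pattern.

```php
<?php
// Same grammar as above; only the last pattern line differs. The JSON value
// must directly follow "json=" and run to the end of the string (\Z).
$anchored_regex = '
  /
  (?(DEFINE)
     (?<number>   -? (?= [1-9]|0(?!\d) ) \d+ (\.\d+)? ([eE] [+-]? \d+)? )
     (?<boolean>   true | false | null )
     (?<string>    " ([^"\\\\]* | \\\\ ["\\\\bfnrt\/] | \\\\ u [0-9a-f]{4} )* " )
     (?<array>     \[  (?:  (?&json)  (?: , (?&json)  )*  )?  \s* \] )
     (?<pair>      \s* (?&string) \s* : (?&json)  )
     (?<object>    \{  (?:  (?&pair)  (?: , (?&pair)  )*  )?  \s* \} )
     (?<json>   \s* (?: (?&number) | (?&boolean) | (?&string) | (?&array) | (?&object) ) \s* )
  )
  json=(?<expr>(?&json))\Z
  /six
';

$str = 'some string and some more string [[delimiter]] json={"a":1,"b":2}';

$data = null;
if (preg_match($anchored_regex, $str, $m)) {
    // $m['expr'] holds only the JSON text captured after "json=".
    $data = json_decode($m['expr'], true);
}

var_dump($data);
```

If the string does not end with a JSON value after `json=`, `preg_match` finds nothing and `$data` stays `null`.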
Ja͢ck
preg_match('/(\{.+\})/', $str, $result);
echo $result[0];

Should do it if the rest of the string doesn't contain curly braces.
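
A rough sketch of how the match could then be decoded and checked, assuming the sample string from the question. The greedy `.+` spans from the first `{` to the last `}`, so `json_last_error()` is used here to reject a capture that turns out not to be valid JSON. The variable `$data` is an illustrative name.

```php
<?php
$str = 'I have a string that contains JSON like this : {"a":1,"b":2,"c":3,"d":4,"e":5} and then string continues';

$data = null;
// The s flag lets "." match newlines, in case the JSON spans several lines.
if (preg_match('/\{.+\}/s', $str, $result)) {
    $decoded = json_decode($result[0], true);
    // Keep the result only if the captured text really was valid JSON.
    if (json_last_error() === JSON_ERROR_NONE) {
        $data = $decoded;
    }
}

var_dump($data);
```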

Novocaine

You should create your own special delimiters around the JSON that you put in the string. If you really can't do that, you can try looking between '{"' and '}' I suppose, but it won't work if those are elsewhere in your string. You can do it with this custom function:

function get_string_between($string, $start, $end){
    $string = " ".$string;
    $ini = strpos($string,$start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string,$end,$ini) - $ini;
    return substr($string,$ini,$len);
}

$fullstring = 'I have a string that contains JSON like this : {"a":1,"b":2,"c":3,"d":4,"e":5} and then string continues';
$parsed = get_string_between($fullstring, '{"', '}');

echo $parsed;
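
For the delimiter approach, here is a minimal sketch assuming the `[[delimiter]] json=` layout from the question's edit. Note that `get_string_between` returns the text without its delimiters, so its result would need the braces restored before decoding; the hypothetical helper below (`decode_json_after_marker` is an invented name) instead takes everything after the marker and hands it straight to `json_decode`.

```php
<?php
// Hypothetical helper: decode the JSON that follows the last occurrence
// of $marker in $str. Returns null if the marker is missing or the
// remainder is not a JSON object or array.
function decode_json_after_marker(string $str, string $marker): ?array
{
    $pos = strrpos($str, $marker);
    if ($pos === false) {
        return null; // strict check: position 0 is a valid match
    }

    $json = substr($str, $pos + strlen($marker));
    $data = json_decode($json, true);

    return is_array($data) ? $data : null;
}

$str = 'some string and some more string [[delimiter]] json={"a":1,"b":2}';
$data = decode_json_after_marker($str, '[[delimiter]] json=');

var_dump($data);
```

Using `strrpos` picks the last marker, which is a design choice for content where the JSON is always appended at the end.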
Dany Caissy
  • Yes, that's what I was doing. And the split part will only contain {}'s once for JSON, so your solution will work. Thanks! – jeff Jul 12 '13 at 14:43
  • `if ($ini == 0) return "";`; so, if `{"` is found at the beginning of the string you return an empty string? That can't be right! – Ja͢ck Jul 12 '13 at 15:04
  • Also, this is basically `/\{".+?\}/` which is actually easier to read :) – Ja͢ck Jul 12 '13 at 15:10
  • @Jack The first character of the string is always a space, look at the first line inside the function.. – Dany Caissy Jul 12 '13 at 15:15
  • Which brings me to my other point; the one where a regular expression is actually easier to read than this code :) – Ja͢ck Jul 12 '13 at 15:16
  • No it's not, regular expressions are not easy to read and they aren't fast either. Also with this function, you can just specify the delimiters and get your result without the need to write a regular expression for each different input. – Dany Caissy Jul 12 '13 at 15:21
  • Your argument about performance is irrelevant, I hope you'd agree on that. Besides, you can easily generate such a generic expression instead of writing them each time. – Ja͢ck Jul 12 '13 at 15:30
  • Yes, performance is insignificant in this case. But well, we don't need to argue, visitors will have the two solutions to choose from. – Dany Caissy Jul 12 '13 at 15:44