In your case you can exploit the fact that strings in JSON cannot span more than one line. That gives a handy anchor point for a multi-line aware search and replace with a regular-expression function such as preg_replace_callback in PHP.
/^\s+"[a-z_]+": "([^"]*".*)",?$/mi
This reads as: whitespace at the beginning of the line; the member name as a string in the form of a valid name (only letters and underscores here); the : separator; and then the broken string up to the end of the line, optionally followed by a comma (,?).
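To see which lines the pattern picks out, here is a minimal sketch. The sample data and the lowercase member names are made up for illustration and are not the data from the question; the pattern uses the [a-z_]+ name class, the same as in the fix code.

```php
<?php
// Hypothetical sample: only the "description" line contains unescaped double quotes.
$sample = <<<'JSON'
{
  "title": "something good",
  "description": "something "WRONG" here!",
  "url": "http://www.example.com/"
}
JSON;

$pattern = '/^\s+"[a-z_]+": "([^"]*".*)",?$/mi';

// Collect every line that has an extra double quote inside its string value.
preg_match_all($pattern, $sample, $matches);

// $matches[1] holds the raw (still broken) string content of each matching line.
print_r($matches[1]);
```

With this sample only the description line should be reported, because it is the only line where another double quote follows the first one inside the value.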
For the data in question, this regex already matches only the invalid lines. However, if your JSON also contains a valid string with an escaped quote (\") inside, the regex matches that line as well and the fix does not really work.
So it is also good to add some checks that the replacement does what it is intended to do.
$like = '... json-like but broken json string as in question ...';
// Fixing #1: member strings containing double-quotes on the same line.
$fix1Pattern = '/^(\s+"[a-z_]+": ")([^"]*".*)(",?)$/mi';
$fix1Callback = function ($matches) {
    list($full, $prefix, $string, $postfix) = $matches;
    $fixed = strtr($string, ['"' => '\"']);
    if (!is_string(json_decode("\"$fixed\""))) {
        throw new Exception('Fix #1 did not work as intended');
    }
    return "$prefix$fixed$postfix";
};
// apply fix1 onto the string
$buffer = preg_replace_callback($fix1Pattern, $fix1Callback, $like);
// test if it finally works
print_r(json_decode($buffer));
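The caveat about strings that already contain \" can be handled in a limited way. The following is only a sketch of one possible guard, not part of the fix above: the callback first asks json_decode whether the captured text is already a valid JSON string and, if so, leaves the line untouched. The variable names and the two sample lines are hypothetical.

```php
<?php
$pattern = '/^(\s+"[a-z_]+": ")([^"]*".*)(",?)$/mi';

// Hypothetical input: the first line is valid (quotes escaped), the second is broken.
$lines = <<<'TXT'
  "valid": "already \"escaped\" here",
  "broken": "something "WRONG" here!"
TXT;

$guarded = function (array $m) {
    list($full, $prefix, $string, $postfix) = $m;
    // If the captured text already decodes as a JSON string, keep the line as it is.
    if (is_string(json_decode("\"$string\""))) {
        return $full;
    }
    // Otherwise escape every double quote inside the value.
    return $prefix . strtr($string, ['"' => '\"']) . $postfix;
};

echo preg_replace_callback($pattern, $guarded, $lines), "\n";
```

This still does not cover a value that mixes escaped and unescaped quotes on the same line; such a line would need a more careful replacement than strtr.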
Keep in mind that this is limited. You might need to learn about regular expressions first, which is a world of its own. But the principle is often very similar: you search the string for the patterns that mark the broken parts and then do some string manipulation to fix them.
If the JSON string is more badly broken, it needs even more care and probably cannot be solved easily with a regular expression alone.
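When working through further fixes, it can help to see why decoding still fails after each step. A small sketch, assuming a hypothetical helper named decodeOrExplain that is not part of the code above; it relies only on json_last_error and json_last_error_msg from PHP's JSON extension:

```php
<?php
// Hypothetical helper: return the decoded data, or explain why decoding failed.
function decodeOrExplain(string $json)
{
    $data = json_decode($json);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('Still not valid JSON: ' . json_last_error_msg());
    }
    return $data;
}
```

Calling it on the buffer after each fix shows whether another repair pass is needed before the data can be used.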
Example output for the code example and the data provided:
stdClass Object
(
[d] => stdClass Object
(
[results] => Array
(
[0] => stdClass Object
(
[__metadata] => stdClass Object
(
[uri] => https://api.datamarket.azure.com/Data.ashx/Bing/Search/Web?Query=u0027non supporting iframesu0027&Market=u0027it-ITu0027&Adult=u0027Offu0027&Options=u0027DisableLocationDetectionu0027&WebSearchOptions=u0027DisableQueryAlterationsu0027&$skip=0&$top=1
[type] => WebResult
)
[ID] => 7858fc9f-6bd5-4102-a835-0fa89e9f992a
[Title] => something good
[Description] => something "WRONG" here!
[DisplayUrl] => www.devx.com/Java/Article/27685/1954
[Url] => http://www.devx.com/Java/Article/27685/1954
)
)
[__next] => https://api.datamarket.azure.com/Data.ashx/Bing/Search/Web?Query=u0027non%20supporting%20iframesu0027&Market=u0027it-ITu0027&Adult=u0027Offu0027&Options=u0027DisableLocationDetectionu0027&WebSearchOptions=u0027DisableQueryAlterationsu0027&$skip=50
)
)