I have some JSON that I got through an API call. I run json_decode on it, grab an array from it, then re-encode it with json_encode. The result, however, is not the same JSON: it's messing up the URLs. How do I make it encode properly?

original

{"created_at":"Mon, 19 Mar 2012 01:34:41 +0000","entities":{"hashtags":[{"text":"stanford","indices":[23,32]}],"urls":[{"url":"http:\/\/t.co\/Of4z6jKG","expanded_url":"http:\/\/360.io\/5sZc2T","display_url":"360.io\/5sZc2T","indices":[33,53]}],"user_mentions":[]},"from_user":"rayfk","from_user_id":335143881,"from_user_id_str":"335143881","from_user_name":"Raymond Kennedy","geo":{"coordinates":[37.4227,-122.1753],"type":"Point"},"id":181554251733020673,"id_str":"181554251733020673","iso_language_code":"en","metadata":{"result_type":"recent"},"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/1468102095\/image_normal.jpg","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1468102095\/image_normal.jpg","source":"&lt;a href=&quot;http:\/\/www.occipital.com\/360\/app&quot; rel=&quot;nofollow&quot;&gt;360 Panorama&lt;\/a&gt;","text":"View from mid lake log #stanford http:\/\/t.co\/Of4z6jKG","to_user":null,"to_user_id":null,"to_user_id_str":null,"to_user_name":null}

after decode/encode combo

{"created_at":"Mon, 19 Mar 2012 01:34:41 +0000","entities":{"hashtags":[{"text":"stanford","indices":[23,32]}],"urls":[{"url":"http:\/\/t.co\/Of4z6jKG","expanded_url":"http:\/\/360.io\/5sZc2T","display_url":"360.io\/5sZc2T","indices":[33,53]}],"user_mentions":[]},"from_user":"rayfk","from_user_id":335143881,"from_user_id_str":"335143881","from_user_name":"Raymond Kennedy","geo":{"coordinates":[37.4227,-122.1753],"type":"Point"},"id":181554251733020673,"id_str":"181554251733020673","iso_language_code":"en","metadata":{"result_type":"recent"},"profile_image_url":"http:\/\/a0.twimg.com\/profile_images\/1468102095\/image_normal.jpg","profile_image_url_https":"https:\/\/si0.twimg.com\/profile_images\/1468102095\/image_normal.jpg","source":"<a href="http:\/\/www.occipital.com\/360\/app" rel="nofollow">360 Panorama<\/a>","text":"View from mid lake log #stanford http:\/\/t.co\/Of4z6jKG","to_user":null,"to_user_id":null,"to_user_id_str":null,"to_user_name":null}

Those are the full snippets, but the culprit is this:

original "source":"&lt;a href=&quot;http:\/\/www.occipital.com\/360\/app&quot; rel=&quot;nofollow&quot;&gt;360 Panorama&lt;\/a&gt;"

after "source":"<a href="http:\/\/www.occipital.com\/360\/app" rel="nofollow">360 Panorama<\/a>"

Tony Stark

2 Answers

I'm not sure what is causing it, but you can correct it by applying the html_entity_decode() function to the "after" version. This will change entities such as &lt; or &quot; back to their original characters.

Depending on how it affects your quoting, there are a few flags you can pass it as well to get the result you need.

  • ENT_COMPAT: Will convert double-quotes and leave single-quotes alone.
  • ENT_QUOTES: Will convert both double and single quotes.
  • ENT_NOQUOTES: Will leave both double and single quotes unconverted.
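To make the difference between the three flags concrete, here is a minimal sketch. The sample string is made up for illustration (it is not taken from the tweet JSON); it just carries one double-quoted and one single-quoted attribute so that each flag gives a different result.

```php
<?php
// Hypothetical sample: &quot; is an encoded double quote, &#039; an encoded single quote.
$encoded = '&lt;a href=&quot;x&quot; title=&#039;y&#039;&gt;';

$compat   = html_entity_decode($encoded, ENT_COMPAT);   // decodes &quot; but not &#039;
$quotes   = html_entity_decode($encoded, ENT_QUOTES);   // decodes both kinds of quote
$noquotes = html_entity_decode($encoded, ENT_NOQUOTES); // leaves both kinds encoded

echo $compat, "\n", $quotes, "\n", $noquotes, "\n";
```

In all three cases &lt; and &gt; are decoded; only the handling of the quote entities changes.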

[EDIT]

Run your broken JSON through this function:

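function fixDoubleQuotedJSON($broken_json)
{
   // Escape only unescaped quotes that sit inside an HTML tag (between < and >);
   // escaping every quote would also break the structural JSON quotes.
   return preg_replace('/(?<!\\\\)"(?=[^<>]*>)/', '\\\\"', $broken_json);
}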
Jeremy Harris
  • so actually it's the other way around... the tool i use accepts the encoded HTML and rejects the decoded. does JSON require encoded HTML or is this tool just lame? – Tony Stark Mar 19 '12 at 08:02
  • http://jsonformatter.curiousconcept.com/#jsonformatter this tool also says the after version is bad, on that same line. – Tony Stark Mar 19 '12 at 08:03
  • No, JSON doesn't have to have htmlentities() applied. Although, to avoid problems, quotes should be escaped. There is a known problem with double quotes being escaped by json_encode() and then not being able to be read by jQuery's .parseJSON() but there are workarounds if you google for them. – Jeremy Harris Mar 19 '12 at 08:07
  • so why do both of those json viewers complain on that line? – Tony Stark Mar 19 '12 at 08:08
  • oh, nevermind, i see why you gave me the stuff on quotes now :) – Tony Stark Mar 19 '12 at 08:12
  • Running your **after** JSON through that tool shows that you have too many double quotes. The ones inside the url should be escaped. Here's a related SO question: http://stackoverflow.com/questions/949604/json-parse-error-with-double-quotes – Jeremy Harris Mar 19 '12 at 08:13
  • right, but the issue is that either decode or encode is making the html special characters into quotes again. would be nice if they just left the special chars alone. – Tony Stark Mar 19 '12 at 08:15
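For anyone wanting to check which step touches the entities, here is a rough sketch that round-trips just the "source" field from the question. It assumes a stock json_decode/json_encode with default options; the variable names are only for illustration. On such a setup the entities should come back unchanged, and even if the value has been HTML-decoded somewhere in between, json_encode escapes the inner quotes rather than emitting them raw.

```php
<?php
// The "source" field as it appears in the original JSON (entities intact).
$original = '{"source":"&lt;a href=&quot;http:\/\/www.occipital.com\/360\/app&quot; rel=&quot;nofollow&quot;&gt;360 Panorama&lt;\/a&gt;"}';

$decoded   = json_decode($original, true);
$roundTrip = json_encode($decoded);
var_dump($roundTrip === $original); // the plain round trip leaves &lt; and &quot; alone

// Simulate something HTML-decoding the value before it is re-encoded.
$htmlDecoded = array('source' => html_entity_decode($decoded['source'], ENT_QUOTES));
$reencoded   = json_encode($htmlDecoded);
echo $reencoded, "\n"; // inner quotes are written as \" so the JSON stays valid
```

If the output of a script shows bare quotes inside the string, it is worth checking whether something after json_encode (a template, a browser rendering the entities, an html_entity_decode call) is decoding the result.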

http://codepad.org/DMkAS2iR

They seem to be equal, except at an unexpected place:

before: "id":181554251733020673,"id_str"
after:  "id":181554251733020000,"id_str"

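Those IDs won't match after a supposedly "lossless" JSON round trip: as the output above shows, the large integer loses its last digits when it is decoded. The JSON_BIGINT_AS_STRING option that avoids this is only supported from PHP 5.4.

Codepad's PHP version is 5.2.5, by the way.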
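A small sketch of what JSON_BIGINT_AS_STRING does, assuming PHP 5.4 or later. The id below is hypothetical and deliberately larger than a 64-bit signed integer, so the effect shows up on 64-bit builds too (where the tweet id from the question would fit in a native integer).

```php
<?php
// Hypothetical id, too large for a native PHP integer on any platform.
$json = '{"id":12345678901234567890,"id_str":"12345678901234567890"}';

// Default behaviour: the oversized integer is decoded as a float and loses digits.
$lossy = json_decode($json, true);

// With the flag, oversized integers are kept as exact strings instead.
$safe = json_decode($json, true, 512, JSON_BIGINT_AS_STRING);

var_dump(is_float($lossy['id']));
var_dump($safe['id'] === $safe['id_str']);
```

On older versions without the flag, reading "id_str" instead of "id" sidesteps the problem, since the API already supplies the id as a string.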

biziclop
  • i wonder if it has to do with the version, mine is 5.2.17, and either the decode or encode ruins the special HTML characters and makes them into quotes again, which is breaking it. – Tony Stark Mar 19 '12 at 08:26
  • also I just confirmed that it's the encode that screws it up. if i echo right after the decode, the string matches the original. – Tony Stark Mar 19 '12 at 08:33