I'm using the Google Maps API to retrieve location information. I fetch the result via cURL, and the returned string should be converted to a JSON object using json_decode().

For many locations (in, for example, the Netherlands) this works like a charm. But for many German locations (and probably those in other countries such as Austria and Switzerland) this doesn't work as expected.

I believe this is because of 'special' characters like ß, but also ü, ë, ä, ï and so on.

For example, this is the request I fetch via cURL: http://maps.googleapis.com/maps/api/geocode/json?address=Stoltenkampstra%C3%9Fe%2011,Bad%20Bentheim&sensor=false&language=nl

In the following, $sResponse is the result fetched by cURL. When I call json_decode($sResponse), it returns null. json_last_error() returns 5 (which means JSON_ERROR_UTF8), yet mb_detect_encoding($sResponse) says UTF-8.
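The failing flow described above can be sketched as follows (the URL is the one from the question; note that mb_detect_encoding() only makes a heuristic guess, so it can report UTF-8 even when a few invalid bytes make json_decode() bail out with JSON_ERROR_UTF8):

```php
<?php
// Fetch the geocoding response (URL taken from the question).
$ch = curl_init('http://maps.googleapis.com/maps/api/geocode/json'
    . '?address=Stoltenkampstra%C3%9Fe%2011,Bad%20Bentheim&sensor=false&language=nl');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$sResponse = curl_exec($ch);
curl_close($ch);

$data = json_decode($sResponse);
if ($data === null) {
    // 5 === JSON_ERROR_UTF8: malformed UTF-8 characters in the input.
    echo json_last_error(), PHP_EOL;
    // mb_detect_encoding() is a guess and may still report "UTF-8".
    echo mb_detect_encoding($sResponse), PHP_EOL;
}
```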

Any suggestions?

Ben Fransen
  • It works here with a simple `var_dump(json_decode(file_get_contents("http://maps.googleapis.com/maps/api/geocode/json?address=Stoltenkampstra%C3%9Fe%2011,Bad%20Bentheim&sensor=false&language=nl")));`. – Wrikken Jun 27 '13 at 18:11
  • Show us what the response is exactly, what exactly you're trying to decode. `var_dump($sResponse)`. – deceze Jun 27 '13 at 18:12
  • 1
    Argh, thanks Wrikken! I've tried your link and I saw malformed characters as well. I forgot to tell the document what its charset should be using a meta-tag. I'm sorry! :) – Ben Fransen Jun 27 '13 at 18:17

1 Answer

If you encounter this problem as well, make sure you've set your document to the correct charset. In my case, I forgot to include <meta charset='utf-8'> in my index.php file. That was what I overlooked... Dumb... but maybe it helps you in the future ;)

As Gumbo correctly mentioned, this wasn't the only fix to the problem (it only fixed how the data was presented in my browser). I was also working with the Encoding library, using Encoding::toUTF8(). This is a very neat and helpful class I found during my search for a solution. You can read about it here: Detect encoding and make everything UTF-8
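Under the assumption that the stray bytes are Latin-1 (which covers ß, ü, ä and friends), the repair that Encoding::toUTF8() performs can be sketched with PHP's built-in mb_convert_encoding(); the sample payload below is hypothetical, chosen only to reproduce the JSON_ERROR_UTF8 failure:

```php
<?php
// Hypothetical payload: "\xDF" is ß in ISO-8859-1, but invalid UTF-8.
$sResponse = "{\"street\":\"Stoltenkampstra\xDFe 11\"}";

// json_decode($sResponse) would return null here (JSON_ERROR_UTF8).
// Force the string to valid UTF-8 before decoding. This assumes the
// invalid bytes are ISO-8859-1; if the source encoding is unknown,
// a detector such as Encoding::toUTF8() is the safer choice.
$clean = mb_convert_encoding($sResponse, 'UTF-8', 'ISO-8859-1');
$data  = json_decode($clean);

echo $data->street, PHP_EOL; // Stoltenkampstraße 11
```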

Ben Fransen
  • That couldn't have fixed the problem: JSON isn't HTML. – Gumbo Jun 27 '13 at 18:23
  • As a matter of fact, you are right. I've also added another thing. I've updated my answer. Thanks for being sharp ;) +1 – Ben Fransen Jun 27 '13 at 18:38
  • 1
    Even though that class gets a lot of praise, it's virtually pointless. You cannot work with encodings based on guesses, it's simply not possible by definition. You just need to be aware of what encoding what is in when; not guess and try around. See [What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text](http://kunststube.net/encoding/). It's not that hard. – deceze Jun 27 '13 at 18:49
  • @deceze, I've just spent 50 minutes reading your article. It's a very complete explanation of the topic. +1! Two questions: 1: are you saying UTF-16 is the way to store data in a database (according to 'Unicode To The Confusion', where you say 'UTF-16 is in the middle, using at least two bytes, growing to up to four bytes as necessary')? The project I'm working on is going to store data with characters from all over the world. 2: Why does `json_decode()` fail while the encoding of the string (Google Maps API location response) is UTF-8? – Ben Fransen Jun 29 '13 at 09:25
  • 1) I didn't really say it's *the* way, but yes, for large amounts of "non English" text it *can* be more efficient than UTF-8. At least if you're using a lot of Asian languages. I'd run some tests and calculations to see whether it's the best for *your* use case or not. 2) It shouldn't. I'd like to see a more conclusive demo before saying anything else. – deceze Jun 29 '13 at 17:55