
In my PHP webservice I have to return some arrays containing strings with accented letters in JSON format (I parse them in my client app with JS).

But the htmlentities function replaces the strings containing accented letters with empty strings! If I don't use htmlentities, I get an error on json_encode.

$a = array('key' => 'èàìòù');
foreach ($a as $k => $v)
    $a[$k] = htmlentities($v, ENT_QUOTES | ENT_HTML5, 'UTF-8');

json_encode($a);

I also tried the ENT_COMPAT option, but I get only empty values.

Note that if I print my array with print_r before htmlentities, the content is ok.

More information:

Every file is in UTF-8 w/o BOM format (saved with Notepad++). I already tried to add the headers to set UTF-8 in my PHP and HTML. The DB table is in utf8_unicode_ci format.

Versions: PHP 5.3.3, MySQL 5.5.30-1.

Thank you for your help!

  • What about _not_ using `htmlentities`? I believe `é` is converted to `\u00e9` by `json_encode`, why not use that? – Elias Van Ootegem Jul 03 '13 at 13:46
  • Those should all get converted automatically in `json_encode`. Proof: http://phpfiddle.org/main/code/p0t-74w – Rob W Jul 03 '13 at 13:49
  • To be perfectly clear: there is no need to run `htmlentities()` on data before `json_encode()`ing it. – Pekka Jul 03 '13 at 13:55
  • `$a = array('key' => utf8_encode('èàìòù')); $t = json_encode($a); echo "\n"; print_r(json_decode($t)); echo "\n";` Please see the third answer below the accepted answer (by @sdespont) to this question: http://stackoverflow.com/questions/6928982/how-to-json-encode-array-with-french-accents – Yogesh Jul 03 '13 at 14:07
  • What happens if you just use $a[$k] = htmlentities($v); ? – Bjorn 'Bjeaurn' S Jul 03 '13 at 13:48
  • I get an error encoding JSON, because the "è" character is converted to "&egrave;". If I don't use `htmlentities` at all, the "è" character is converted to "\u00e8" and I get an error again. I use jQuery with the `$.getJSON` method. – Marco Malentacchi Jul 03 '13 at 15:32

1 Answer


You don't say which error json_encode() gives you, but I suspect your incoming data is not actually UTF-8 encoded. That would break both calls. The unnecessary htmlentities() call, which you are telling to expect UTF-8, returns an empty string when its input is not valid in the given charset, and json_encode() requires all input strings to be UTF-8.

Make sure the data you are encoding is actually UTF-8 encoded. You won't need htmlentities() at all then.
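
As a rough sketch of that check (not something from the original post): the helper name `ensure_utf8()` is made up for illustration, and the ISO-8859-1 fallback is only an assumption about what non-UTF-8 data might be. It relies on the mbstring extension and avoids anything newer than PHP 5.3.

```php
<?php
// Hypothetical helper: make sure a string is valid UTF-8 before it
// reaches json_encode(). The fallback source encoding is an assumption.
function ensure_utf8($value, $assumedSource = 'ISO-8859-1')
{
    // Valid UTF-8 passes through untouched.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    // Otherwise convert from the assumed source encoding.
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

$a = array(
    'utf8'   => "\xC3\xA8\xC3\xA0", // "èà" as UTF-8 bytes
    'latin1' => "\xE8\xE0",         // "èà" as ISO-8859-1 bytes
);

foreach ($a as $k => $v) {
    $a[$k] = ensure_utf8($v);
}

// No htmlentities() call: json_encode() escapes non-ASCII itself.
$json = json_encode($a);
if ($json === false) {
    // json_last_error() returns one of the JSON_ERROR_* constants.
    die('json_encode failed, error code ' . json_last_error());
}

echo $json; // {"utf8":"\u00e8\u00e0","latin1":"\u00e8\u00e0"}
```

The `\u00e8` escapes are valid JSON, and a JSON parser on the client turns them back into "è". Guessing the source encoding is a stopgap; fixing the encoding where the data originates is the cleaner solution.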

You don't say where the data comes from, but if it comes from a database, the connection sometimes needs to be set to UTF-8 explicitly. See "UTF-8 all the way through".
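
If the data does come from MySQL, setting the connection charset could look something like the following. This is a sketch under assumptions: the question does not say which database extension is in use, so mysqli is a guess, and `connect_utf8()` and all connection parameters are placeholders.

```php
<?php
// Hypothetical helper: open a mysqli connection and switch it to UTF-8.
// Assumes the mysqli extension; host, credentials and database name
// are supplied by the caller.
function connect_utf8($host, $user, $password, $database)
{
    $db = new mysqli($host, $user, $password, $database);
    if ($db->connect_error) {
        throw new RuntimeException('Connect failed: ' . $db->connect_error);
    }
    // Ask the server to send and receive UTF-8 on this connection.
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $db->error);
    }
    return $db;
}
```

You would call it once, for example `$db = connect_utf8('localhost', 'user', 'secret', 'mydb');` with your own values, before running any queries. Strings fetched afterwards should then arrive as UTF-8 and can go straight into json_encode().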

Pekka