44

How can I save a JSON-encoded string with international characters to the database and then parse the decoded string in the browser?

<?php           
    $string = "très agréable";  
    // to the database 
    $j_encoded = json_encode(utf8_encode($string)); 
    // get from Database 
    $j_decoded = json_decode($j_encoded); 
?>    
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
    <?= $j_decoded ?>
</html> 
candlejack
FFish

10 Answers

80

JSON encode and decode with UTF-8:

json_encode($data, JSON_UNESCAPED_UNICODE)

json_decode($json, false, 512, JSON_UNESCAPED_UNICODE)

Forcing UTF-8 might be helpful too: http://pastebin.com/2XKqYU49
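To make the effect of the flag concrete, here is a minimal sketch of the round trip. It assumes the PHP file itself is saved as UTF-8. The variable names are illustrative only. As far as the PHP manual documents it, `JSON_UNESCAPED_UNICODE` is an encoding option; `json_decode()` turns `\uXXXX` escapes back into UTF-8 on its own.

```php
<?php
// Sketch: round-trip a UTF-8 string through JSON.
$string = "très agréable";   // assumes this file is saved as UTF-8

// Default: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($string);

// With JSON_UNESCAPED_UNICODE the characters are written literally.
$literal = json_encode($string, JSON_UNESCAPED_UNICODE);

// json_decode() returns the same UTF-8 string for both forms.
$fromEscaped = json_decode($escaped);
$fromLiteral = json_decode($literal);
```

Both encoded forms are valid JSON, so the choice mostly affects readability and size of what is stored.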

Lukas Liesis
  • Why the downvote? I have had enough situations where this was THE only thing that worked. Don't tell me all the stuff about the encoding of the file, the database, etc. There are situations where you don't know your resource's encoding and it arrives at random: some UTF-8, some any other encoding you can imagine. – Lukas Liesis Jul 28 '15 at 10:23
  • Only thing that worked for me. I'm building an API and I need to print the response purely as JSON with encoded chars – Ariel Jun 17 '16 at 20:02
  • Thanks Lukas, this was exactly what I was looking for. It converts encoding such as `\u00e9` into `é`. I just confirmed its usage in the [PHP docs (example 2)](http://php.net/manual/en/function.json-encode.php#example-4335). I am still curious though, is the `depth` parameter really useful? If the recursion is stopped at some depth, does it mean the json will not be fully en/decoded according to the bitmask? – CPHPython Aug 29 '16 at 18:11
  • See also [this answer](https://stackoverflow.com/a/45681613/287948), use `JSON_UNESCAPED_UNICODE|JSON_UNESCAPED_SLASHES` – Peter Krauss Oct 08 '18 at 08:00
29
  header('Content-Type: application/json; charset=utf-8');
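A minimal sketch of how that header might be used when printing a JSON response. `sendJson()` is a hypothetical helper, not part of PHP, and the `JSON_THROW_ON_ERROR` flag assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper: emits a JSON response declared as UTF-8.
function sendJson($data): string
{
    // Headers can only be set before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: application/json; charset=utf-8');
    }
    // Throw instead of returning false if the data cannot be encoded.
    $body = json_encode($data, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
    echo $body;
    return $body;
}
```

Calling `sendJson(['message' => 'très agréable'])` would print the object with the accented characters written literally.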
Ahmet Erkan ÇELİK
29

This is an encoding issue. It looks like at some point, the data gets represented as ISO-8859-1.

Every part of your process needs to be UTF-8 encoded.

  • The database connection

  • The database tables

  • Your PHP file (if you are using special characters inside that file as shown in your example above)

  • The content-type headers that you output
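For the database connection item, one possible sketch, assuming MySQL accessed through PDO (the question does not say which database is in use). `utf8Dsn()` is a hypothetical helper, and the host, database name and credentials are placeholders.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that requests a UTF-8 connection.
function utf8Dsn(string $host, string $dbName): string
{
    // utf8mb4 is MySQL's full UTF-8 character set.
    return "mysql:host={$host};dbname={$dbName};charset=utf8mb4";
}

// Usage (placeholder host, database and credentials):
// $pdo = new PDO(utf8Dsn('localhost', 'app'), 'user', 'secret');
```

The tables and the output headers still need their own UTF-8 settings; this only covers the connection.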

Pekka
12

If your source file is already UTF-8, drop the utf8_* functions. PHP 5 stores strings as arrays of bytes.
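A small sketch of why this matters, assuming the mbstring extension is enabled. `mb_convert_encoding()` from ISO-8859-1 to UTF-8 is used here as a stand-in for `utf8_encode()`, which performs the same conversion; applying it to a string that is already UTF-8 encodes it a second time.

```php
<?php
$utf8 = "très";   // 4 characters, 5 bytes when the file is saved as UTF-8

// Same conversion utf8_encode() performs: treat the input as ISO-8859-1.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// $double now holds "trÃ¨s": each byte of "è" became its own character.
// Converting back to ISO-8859-1 undoes the extra encoding step.
$restored = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
```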

You should add a meta tag declaring the encoding in the HTML, and you should also send an HTTP header that sets the character encoding to UTF-8.

<html>
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

and in php

<?php
header('Content-Type: text/html; charset=utf-8');
coding Bott
5
  1. utf8_decode: $j_decoded = utf8_decode(json_decode($j_encoded)); EDIT: or, to be more correct, $j_encoded = json_encode($string); $j_decoded = json_decode($j_encoded); there is no need to encode or decode UTF-8.
  2. <meta charset="utf-8" />
teemitzitrone
  • Why would one need `utf8_decode()` (which converts to ISO-8859-1) in a UTF-8 environment? – Pekka Nov 02 '10 at 11:15
  • ok, I see. I have to utf8_decode() as well.. Is there a difference doing utf8_decode(json_decode($j_encoded)) vs json_decode(utf8_decode($j_encoded))? – FFish Nov 02 '10 at 11:19
  • Yes, there is, and to be correct you shouldn't use `utf8_encode` anyway. But the way you used it is the point: first you encode UTF-8, then JSON, so to get your input back you have to decode JSON and then UTF-8 – teemitzitrone Nov 02 '10 at 11:25
  • @Pekka if you mess up encoding anyway (see the `utf8_encode`) you have to correct it. ok, don't mess up the encoding is also a solution and i've edited my answer to reflect that – teemitzitrone Nov 02 '10 at 11:30
5

Try sending the UTF-8 Charset header:

<?php header ('Content-type: text/html; charset=utf-8'); ?>

And the HTML meta:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Coquevas
3

For me, both methods together worked:

<?php

header('Content-Type: text/html; charset=utf-8');

echo json_encode($YourData, \JSON_UNESCAPED_UNICODE);
rink.attendant.6
0

If you get an "unexpected character" error, check whether a BOM (Byte Order Mark) was saved into your UTF-8 JSON. You can either remove the first character or save the file without a BOM.
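Removing the BOM in code could look like the sketch below. `stripUtf8Bom()` is a hypothetical helper; the UTF-8 BOM is the three bytes EF BB BF at the very start of the data.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 BOM (bytes EF BB BF) if present.
function stripUtf8Bom(string $json): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($json, $bom, 3) === 0 ? substr($json, 3) : $json;
}

// Example input: JSON text that was saved with a BOM in front.
$withBom = "\xEF\xBB\xBF" . '{"mot":"agréable"}';
$data = json_decode(stripUtf8Bom($withBom), true);
```

Input without a BOM passes through unchanged, so the helper can be applied before every decode.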

Blox
0

Works for me :)

function jsonEncodeArray( $array ){
    array_walk_recursive( $array, function(&$item) { 
       $item = utf8_encode( $item ); 
    });
    return json_encode( $array );
}
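If the array may mix strings that are already UTF-8 with strings that are not, converting everything risks encoding some values twice. A possible variant, assuming the mbstring extension is enabled, PHP 7.3 or later, and that any string failing the UTF-8 check is ISO-8859-1 (an assumption, not something PHP can detect reliably). `jsonEncodeMixedArray()` is a hypothetical name.

```php
<?php
// Variant of jsonEncodeArray(): converts only strings that are not valid UTF-8.
// Assumption: any such string is ISO-8859-1.
function jsonEncodeMixedArray(array $array): string
{
    array_walk_recursive($array, function (&$item) {
        if (is_string($item) && !mb_check_encoding($item, 'UTF-8')) {
            $item = mb_convert_encoding($item, 'UTF-8', 'ISO-8859-1');
        }
    });
    // Throw instead of returning false if encoding still fails.
    return json_encode($array, JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR);
}
```

The check is a heuristic: a few ISO-8859-1 byte sequences also happen to be valid UTF-8 and would be left untouched.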
Douglas Comim
  • Thanks, this pointed the way for me, except it made things worse at first! :) then I realized that was because something had obviously been "over encoded" somewhere deeper in the stack, so - perhaps strangely - changing the `utf8_encode` to `utf8_decode` solved it. – randomsock Feb 16 '18 at 20:20
  • utf8_decode() is always the go-to. Beyond that, you're looking at mb_convert_encoding() or, my preferred, the iconv extension. – WiiLF Feb 08 '22 at 22:07
0

I had the same problem. It might differ depending on how you put the data into the db, but try what worked for me:

$str = json_encode($data);
$str = addslashes($str);

Do this before saving data to db.