30

Quest

I am looking to fetch rows that have accented characters. The encoding for the column (NAME) is latin1_swedish_ci.

The Code

The following query returns Abord â Plouffe using phpMyAdmin:

SELECT C.NAME FROM CITY C
WHERE C.REGION_ID=10 AND C.NAME_LOWERCASE LIKE '%abor%'
ORDER BY C.NAME LIMIT 30

The following displays expected values (function is called db_fetch_all( $result )):

  while( $row = mysql_fetch_assoc( $result ) ) {
    foreach( $row as $value ) {
      echo $value . " ";
      $value = utf8_encode( $value );
      echo $value . " ";
    }

    $r[] = $row;
  }

The displayed values: 5482 5482 Abord â Plouffe Abord â Plouffe

The array is then encoded using json_encode:

$rows = db_fetch_all( $result );
echo json_encode( $rows );

Problem

The web browser receives the following value:

{"ID":"5482","NAME":null}

Instead of:

{"ID":"5482","NAME":"Abord â Plouffe"}

(Or the encoded equivalent.)

Question

The documentation states that json_encode() works on UTF-8. I can see the values being encoded from LATIN1 to UTF-8. After the call to json_encode(), however, the value becomes null.

How do I make json_encode() encode the UTF-8 values properly?

One possible solution is to use the Zend Framework, but I'd rather not if it can be avoided.

Dave Jarvis
  • 30,436
  • 41
  • 178
  • 315

6 Answers6

43
// Create an empty array for the encoded resultset
$rows = array();

// Loop over the db resultset and put encoded values into $rows
while($row = mysql_fetch_assoc($result)) {
  $rows[] = array_map('utf8_encode', $row);
}

// Output $rows
echo json_encode($rows);
Kemo
  • 6,942
  • 3
  • 32
  • 39
12
foreach( $row as $value ) {
  $value = utf8_encode( $value );

You're not actually writing your encoded value back into the $row array there, you're only changing the local variable $value. If you want to write back when you change the variable, you would need to treat it as a reference:

foreach( $row as &$value ) {

Personally I would try to avoid references where possible, and for this case instead use array_map as posted by Kemo.

Or mysql_set_charset to UTF-8 to get the return values in UTF-8 regardless of the actual table collations, as a first step towards migrating the app to UTF-8.

bobince
  • 528,062
  • 107
  • 651
  • 834
5

My solution is insert this line mysql_query('SET CHARACTER SET utf8');, before the SELECT. This method is good.

bookcasey
  • 39,223
  • 13
  • 77
  • 94
jailsonjan
  • 51
  • 1
  • 1
4

It seems that rather than putting it in a query, one should put:

mysql_set_charset('utf8');

after the mysql connect statement.

Robert Imhoff
  • 471
  • 4
  • 6
1

In your connect: mysql_set_charset('utf8', $link);

Example:

<?php
$link = mysql_connect('localhost', 'your_user', 'your_password');
mysql_set_charset('utf8', $link);
$db_selected = mysql_select_db('your_db', $link);
...

Hope that helps somewhat.

alditis
  • 4,633
  • 3
  • 49
  • 76
1

Try iconv_r($row,"LATIN1","UTF-8//TRANSLIT"); (function below) before you json_encode() your result.

I have UTF-8 as the table encoding and as the result set encoding, but sometimes folks still manage to submit non-UTF-8 characters via forms and it's troublesome to track down every single input source so I have also wrapped json_encode() to make it safer. In particular I've had it NULL strings on me containing the degree symbol and "smart quotes" which folks in the UK seem so fond of.

function safe_json_encode($mixed,$missing="TRANSLIT"){
   $out=json_encode($mixed);
   if ($err=  json_last_error()){
      iconv_r("UTF-8","UTF-8//$missing",$mixed);
      $out=json_encode($mixed);
   }
   return $out;
}
function iconv_r($charset_i, $charset_o, &$mixed) {
   if (is_string($mixed)) {
      $mixed = iconv($charset_i, $charset_o, $mixed);
   } else {
      if (is_object($mixed)){
         $mixed = (array) $mixed;
      }
      if (is_array($mixed)){
         foreach ($mixed as $key => &$value) {
            iconv_r($charset_i, $charset_o, $value);
         }
      }
   }
}
Wil
  • 757
  • 9
  • 12