67

I'm calling json_encode($data) on a data array, and one field contains Russian characters.

I used mb_detect_encoding() to check the encoding of that field, and it reports UTF-8.

I think json_encode fails because of some bad characters in the field, like "ра▒". I tried a lot of things, including utf8_encode on the data. That bypasses the error, but then the data no longer looks correct.

What can be done about this issue?

thisiskelvin
sparkmix

9 Answers

108

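The issue occurs when the string contains some byte sequences that are not valid UTF-8, even though most of its characters are valid. Converting the string from UTF-8 to UTF-8 cleans out the invalid sequences (by default mbstring substitutes them, typically with `?`), and after that json_encode works.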

$data['name'] = mb_convert_encoding($data['name'], 'UTF-8', 'UTF-8');
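
As a minimal sketch of the approach above, you can first check whether a string is valid UTF-8 with mb_check_encoding() and sanitize it only when needed. The helper name `sanitizeUtf8` and the sample string are hypothetical and not part of the original answer; the sketch assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helper: returns the string unchanged if it is valid UTF-8,
// otherwise re-encodes it so that invalid byte sequences are substituted.
function sanitizeUtf8(string $value): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }
    return mb_convert_encoding($value, 'UTF-8', 'UTF-8');
}

// Sample data (assumption): valid Cyrillic text followed by a truncated
// two-byte sequence, similar to the "ра▒" symptom in the question.
$name = "ра" . "\xD0";

var_dump(json_encode(['name' => $name]));            // bool(false): encoding fails
$clean = sanitizeUtf8($name);
var_dump(mb_check_encoding($clean, 'UTF-8'));        // bool(true)
var_dump(json_encode(['name' => $clean]) !== false); // bool(true)
```

Checking first leaves valid strings byte-for-byte unchanged, which makes it easier to log or inspect only the values that actually needed repair.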
Stan Quinn
sparkmix
  • 6
    You might want to add this as well `$mysqli->set_charset("utf8");` – Justin Joy Aug 15 '20 at 19:44
  • I've tried to find that invalid string by adding the following code: ` foreach ($addresses as $address) { $converted = mb_convert_encoding($address, 'UTF-8', 'UTF-8'); if ($converted !== $address) { dd($addresses); } }` Two points: 1. The `$converted !== $address` condition is never met. I suppose this is because `===` is a "binary-safe" operator… 2. I don't get error in the end, even though I never assign `$converted` to anything! It's like `mb_convert_encoding()` accepted string by reference, although it's not… – pilat Apr 27 '21 at 18:19
  • Funny also because of the bad encoding, both functions `mb_check_encoding()` and `json_decode()` didn't work properly.. And `mb_detect_encoding()` did..It was a problem with bad UTF-8 encoded file.. But after `mb_convert_encoding()`, everything worked as expected.. – Dan Oct 04 '22 at 22:31
49

If you need to JSON-encode a multidimensional array, you can use the function below.

If json_encode fails with JSON_ERROR_UTF8, sanitize the data first:

$encoded = json_encode( utf8ize( $responseForJS ) );

The function below sanitizes array data recursively:

/* Sanitizes corrupt UTF-8 before json_encode, to avoid the error
 * "Malformed UTF-8 characters, possibly incorrectly encoded".
 */
function utf8ize( $mixed ) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } elseif (is_string($mixed)) {
        return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
    }
    return $mixed;
}
Irshad Khan
  • 9
    `mb_convert_encoding` does the recursive work itself, as you can see in the documentation [link](https://www.php.net/manual/en/function.mb-convert-encoding.php): _If val is an array, all its string values will be converted recursively._ So the function `utf8ize` is not needed. All you need would be `json_encode(mb_convert_encoding($responseForJS, "UTF-8", "UTF-8"));` – elnezah Feb 04 '20 at 16:19
  • 6
    mb_convert_encoding is only able to convert arrays if you are running PHP 7.2 or above, just for clarification. Otherwise, this function works perfectly. – mylesmg Feb 15 '20 at 22:42
30

Since PHP 7.2, two flags let json_encode handle invalid UTF-8 directly:

https://www.php.net/manual/en/function.json-encode

json_encode($text, JSON_INVALID_UTF8_IGNORE);

Or

json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);
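
A short sketch of how the two flags differ, using a made-up string that contains one invalid byte (`\xFF`). JSON_INVALID_UTF8_IGNORE drops the invalid bytes, while JSON_INVALID_UTF8_SUBSTITUTE replaces them with U+FFFD, which shows up as `\ufffd` in the output unless JSON_UNESCAPED_UNICODE is also set. The variable names are only for illustration.

```php
<?php
// Sample input (assumption): ASCII text with a single invalid byte in the middle.
$text = "abc\xFFdef";

// Without a flag, encoding fails and the error code is JSON_ERROR_UTF8.
$plain = json_encode($text);
$error = json_last_error();

// The invalid byte is silently dropped.
$ignored = json_encode($text, JSON_INVALID_UTF8_IGNORE);

// The invalid byte is replaced by the Unicode replacement character.
$substituted = json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);

var_dump($plain);                      // bool(false)
var_dump($error === JSON_ERROR_UTF8);  // bool(true)
echo $ignored, "\n";                   // "abcdef"
echo $substituted, "\n";               // "abc\ufffddef"
```

Substituting keeps a visible marker where data was lost, which can be preferable to ignoring when you later need to track down the source of the bad bytes.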
hugsbrugs
  • thanks, It works for me because my response in api has emoji in title string, but i have one confusion, that i have read somewhere that emoji is utf-8 character then why emoji in string gives this malformed utf-8 characters error? – Haritsinh Gohil Sep 10 '21 at 13:15
  • 1
    @HaritsinhGohil perhaps some emojis are valid UTF-8 chars and others are not ... – hugsbrugs Sep 13 '21 at 12:39
27

Make sure to initialize your PDO object with the charset set to utf8 in the DSN. This should fix the problem and avoid any re-encoding dance.

$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');
Tom Ah
  • This solved my situation. It also works for other connection types, like dlib for MSSQL Server. – Alexandru Topală Apr 03 '19 at 09:11
  • Was given an old project to fix encoding issues and this helped me a lot. Only difference is that this project was using ADO and solution was a little bit different, solved it by using setCharset(), info here http://adodb.org/dokuwiki/doku.php?id=v5:reference:connection:setcharset – yurguis Jul 23 '20 at 16:20
9

Just add charset=utf8 to your PDO connection string, as in the line below:

$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');

Hope this helps.

M.Bilal Murtaza
3

Decode HTML entities before passing the string to the JSON functions. I used html_entity_decode() in PHP and the problem was solved.

$json = html_entity_decode($source);
$data = json_decode($json,true);
Adam Michalik
0

Do you by any chance have UUIDs in your result set? In that case the following database flag will help:

PDO::DBLIB_ATTR_STRINGIFY_UNIQUEIDENTIFIER => true
Kees de Kooter
0

If your data is correctly encoded in the database, make sure to use the mb_* functions for any string handling you do before json_encode. Byte-oriented functions like substr or strlen do not handle multibyte text (for example utf8mb4 data) correctly and can cut a character in half, leaving malformed UTF-8.
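
A small sketch of that failure mode, assuming the source file is saved as UTF-8 and the mbstring extension is loaded. The sample string and variable names are made up for illustration.

```php
<?php
// Sample text (assumption): 6 Cyrillic characters, 2 bytes each in UTF-8.
$text = "Привет";

// substr() counts bytes, so this cut lands in the middle of the second character.
$byteCut = substr($text, 0, 3);

// mb_substr() counts characters, so this returns the first three characters intact.
$charCut = mb_substr($text, 0, 3, 'UTF-8');

var_dump(strlen($text));                        // int(12)
var_dump(mb_strlen($text, 'UTF-8'));            // int(6)
var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(json_encode($byteCut));                // bool(false)
echo json_encode($charCut, JSON_UNESCAPED_UNICODE), "\n"; // "При"
```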

-1

I know this is kind of an old topic, but it was what I needed. I just had to modify the answer by 'jayashan perera'.

//...code
        $stmt->execute();
        $result = $stmt->fetchAll(PDO::FETCH_ASSOC);


        for ($i=0; $i < sizeof($result) ; $i++) { 
            $tempCnpj = $result[$i]['CNPJ'];
            $tempFornecedor = json_encode(html_entity_decode($result[$i]['Nome_fornecedor']),true) ;
            $tempData = $result[$i]['efetivado_data'];
            $tempNota = $result[$i]['valor_nota'];
            $arrResposta[$i] = ["Status"=>"true", "Cnpj"=>"$tempCnpj", "Fornecedor"=>$tempFornecedor, "Data"=>"$tempData", "Nota"=>"$tempNota" ];
        }

        echo json_encode($arrResposta);

And in the .js file I used:

obj = JSON.parse(msg);