
I receive generated JSON files, and some of them contain the ™ symbol. When a file contains that symbol, json_decode fails on it: printing $data shows NULL. If I remove the symbol manually, the data decodes as expected. I am using the code below, which prints the contents of each JSON file until it reaches one that contains the ™ symbol:

$json = file_get_contents($count . '.json');
$data = json_decode($json);
echo '<pre>';
var_dump($data);
echo '</pre>';

I have tried urlencode, urldecode and htmlspecialchars, but they don't work either.

jamie12
  • Is $json coded as UTF-8? – Wiimm Jan 28 '19 at 19:42
  • Can you provide an example JSON string that won't decode? – S. Imp Jan 28 '19 at 19:46
  • This question looks like maybe a duplicate of [this question](https://stackoverflow.com/questions/34100373/json-decode-php-manage-special-characters-tm-symbol) -- although the answer in the other question doesn't look very good either. – S. Imp Jan 28 '19 at 19:48
  • From what I can see in the files it is 90% English with only the following symbols in the odd file: `é`, `™`, `ï`, `ç`, `í`, `ö`, `à`, `ñ`. I cannot change anything about how I receive the file – jamie12 Jan 28 '19 at 20:36
  • This link may be helpful: [https://stackoverflow.com/questions/4663743/how-to-keep-json-encode-from-dropping-strings-with-invalid-characters] – venom_1979 Jan 28 '19 at 20:55

1 Answer


json_decode only parses UTF-8 strings. If the file you are reading is not encoded as UTF-8, decoding fails and json_decode returns NULL.
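
To check whether encoding is really the cause, you can ask PHP why the decode failed. This is only a minimal sketch: `jsonDecodeStatus` is a hypothetical helper name, and the two sample strings are made up for illustration (they are not taken from your files). It uses `json_last_error()` and `json_last_error_msg()`, which are available from PHP 5.5.

```php
<?php
// Hypothetical helper: returns 'ok', or the reason json_decode() rejected the input.
function jsonDecodeStatus(string $json): string
{
    json_decode($json);
    return json_last_error() === JSON_ERROR_NONE ? 'ok' : json_last_error_msg();
}

// Made-up samples: the same JSON text with the trademark sign stored two ways.
$utf8   = "{\"name\":\"Widget\xE2\x84\xA2\"}"; // ™ as the UTF-8 bytes E2 84 A2
$cp1252 = "{\"name\":\"Widget\x99\"}";         // ™ as the single Windows-1252 byte 0x99

echo 'utf8:   ', jsonDecodeStatus($utf8), "\n";   // valid UTF-8, decodes
echo 'cp1252: ', jsonDecodeStatus($cp1252), "\n"; // invalid UTF-8, reports an error
```

If `json_last_error()` returns `JSON_ERROR_UTF8` for your failing file, the file is not valid UTF-8 and needs converting before it is decoded.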

If you do not know the encoding of the file you are reading, you can try to convert the data to UTF-8 before parsing it, as shown in this post:

PHP: Convert any string to UTF-8 without knowing the original character set, or at least try
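
As a rough sketch of the conversion step, assuming the non-UTF-8 files are Windows-1252 (a guess based on the characters listed in the comments, where ™ would be the byte 0x99), something like the following might work. `decodeJsonAssumingCp1252` is a hypothetical name, and the code needs the mbstring extension. If your files use a different encoding, the source encoding passed to `mb_convert_encoding` has to change accordingly.

```php
<?php
// Assumption: any input that is not valid UTF-8 is Windows-1252.
// Hypothetical helper; requires the mbstring extension.
function decodeJsonAssumingCp1252(string $raw)
{
    // Leave input that is already valid UTF-8 untouched, so it is not converted twice.
    if (!mb_check_encoding($raw, 'UTF-8')) {
        $raw = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');
    }
    return json_decode($raw);
}

// Made-up sample: ™ stored as the single Windows-1252 byte 0x99.
$data = decodeJsonAssumingCp1252("{\"name\":\"Widget\x99\"}");
var_dump($data);
```

In your loop this would replace the plain json_decode call, for example `$data = decodeJsonAssumingCp1252(file_get_contents($count . '.json'));`.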

sirlanceoflompoc