
My file is encoded in UTF-8, and all characters work fine in it except "é".

I have tried several solutions, but I always get the character � instead (and only for a few words).

Here is an example of the few options I have tried:

$auteur= utf8_decode("Aurélie ABBĀSĀBĀD");  
$commAuteur = mb_convert_encoding($auteur, "HTML-ENTITIES", 'UTF-8');

I added utf8_decode to be sure I'm going from ISO-8859-1 to UTF-8. Then I also tried this:

$commAuteur  = iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $auteur);

Eventually I also tried this:

 $commAuteur = htmlentities($auteur);

None of these options worked correctly, and I don't understand how some é are displayed normally while others have this issue. I'm out of ideas. Any suggestions?

Jihane
  • is this database-related by any chance? – Funk Forty Niner May 04 '18 at 14:08
  • You don't need to do anything. Just save your source code file as UTF-8 and `echo` the string. `utf8_decode` converts *from UTF-8 to ISO-8859-1*, the opposite of what you expect. – deceze May 04 '18 at 14:09
  • what is the file's encoding also? – Funk Forty Niner May 04 '18 at 14:09
  • This may help: [Handling Unicode Front To Back In A Web App](http://kunststube.net/frontback/) – deceze May 04 '18 at 14:10
  • The file is encoded in UTF-8, and I did try doing nothing, but it doesn't work. That's why I started looking for functions that could solve the issue. – Jihane May 04 '18 at 14:11
  • UTF-8 has two modes; one with and one without a BOM (byte order mark) which will make a difference. – Funk Forty Niner May 04 '18 at 14:11
  • The one I have is without the BOM. – Jihane May 04 '18 at 14:12
  • In a nutshell: `$auteur` is ISO-8859 encoded (because of `utf8_decode`), yet the code always seems to treat it as if it is UTF-8 encoded. That's the entire issue. You just need to get rid of all those functions it's calling. – deceze May 04 '18 at 14:13
  • Actually, all those functions were added to test why the é character is displayed for some words and not for others. When I used just utf8_decode, I noticed that the é is turned into "?" for some words and into � for others. How is that possible? – Jihane May 04 '18 at 14:23
  • What exactly is the difference between *some of those words and others*? Are they in different files? Do they come from different sources (e.g. database)? – deceze May 04 '18 at 14:28
  • That's my problem: they are in the same array and they come from the same database. Here is an example: Aurélie becomes Aur� lie and jérémy becomes j?r?my. – Jihane May 04 '18 at 14:32
  • Then you need to go into more detail here. With encoding the details matter a lot; it's a big difference asking about a string literal in a file and data from a database. General hint: use `echo bin2hex($str)` to *see* what bytes your string actually consists of, which allows you to identify what *encoding* it's actually in, which allows you to debug what's going on. – deceze May 04 '18 at 14:41
  • When I tried bin2hex($str), I got **c3a9** for the é of jérémy and **3f6c** for the é of Aurélie, so the é is not encoded in the same way. – Jihane May 07 '18 at 08:47
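
To make the `bin2hex` hint from the comments concrete, here is a minimal sketch of what the bytes look like before and after a UTF-8 to ISO-8859-1 conversion. The sample strings are hypothetical and written as explicit byte escapes so the result does not depend on how the file is saved. `mb_convert_encoding` is used as a stand-in for `utf8_decode` (same direction of conversion), and the "?" result assumes default mbstring settings.

```php
<?php
// Hypothetical sample: "Aurélie" with "é" as explicit UTF-8 bytes (C3 A9).
$utf8 = "Aur\xC3\xA9lie";

// Same direction as utf8_decode(): UTF-8 in, ISO-8859-1 out.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";   // 417572c3a96c6965  ("é" is c3a9)
echo bin2hex($latin1), "\n"; // 417572e96c6965    ("é" is e9)

// "Ā" (U+0100, UTF-8 bytes C4 80) does not exist in ISO-8859-1, so the
// conversion substitutes "?" (3f) under default mbstring settings.
$lost = mb_convert_encoding("\xC4\x80", 'ISO-8859-1', 'UTF-8');
echo bin2hex($lost), "\n";   // 3f

// A lone e9 byte is not valid UTF-8, which is why a page served as UTF-8
// would show the replacement character for it.
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
```

Read against the output reported above: `c3a9` is a valid UTF-8 "é", while `3f6c` is an ASCII "?" followed by "l". That suggests (as an interpretation, not something confirmed in the thread) that the "é" in that value had already been replaced by a "?" before this code ran, so no later conversion can bring it back.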
