I work on a system that automates signature generation for Outlook. The part that generates the .htm files works great, but now I also need to add files in .txt format. If I use the content without changing the encoding, all my accented characters are converted to different values, for example: "é" becomes "Ã©" and "ô" becomes "Ã´".
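To illustrate the symptom, here is a minimal sketch of how this kind of garbling arises. It assumes the mbstring extension is loaded and that the source text is UTF-8; the variable names are only for illustration.

```php
<?php
// "é" encoded as UTF-8 is the two bytes C3 A9.
$utf8 = "\xC3\xA9";

// A program that assumes Windows-1252 reads each byte as its own character:
// C3 is "Ã" and A9 is "©". Re-encoding that misreading as UTF-8 makes it visible.
$misread = mb_convert_encoding($utf8, "UTF-8", "Windows-1252");

echo bin2hex($utf8), "\n";    // c3a9
echo bin2hex($misread), "\n"; // c383c2a9
echo $misread, "\n";          // shows "Ã©" in a UTF-8 terminal
```

So the text itself is intact; it is only being decoded with the wrong character set.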

This clearly looked like an encoding conflict of some sort. I tried to fix it by converting the input text to the "Windows-1252" encoding:

$myText = iconv( mb_detect_encoding( $myText ) , "Windows-1252//TRANSLIT", $myText);

But it didn't change anything. I also tried:

$myText = mb_convert_encoding($myText, "Windows-1252");

And it didn't work either. For both of these tests, I checked the file type with Atom (my IDE), and it recognises these files as UTF-8. But when I check in the terminal with file -I signature.txt, it reports this encoding: signature.txt: text/plain; charset=iso-8859-1
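When tools disagree like this, looking at the raw bytes from PHP can help. The following is a minimal sketch, assuming mbstring is available; the sample string is hypothetical. Note that "é" is the single byte E9 in both Windows-1252 and ISO-8859-1, so a byte-sniffing tool cannot tell the two apart for this kind of text.

```php
<?php
// Hypothetical sample: "été" as UTF-8 bytes.
$utf8 = "\xC3\xA9t\xC3\xA9";
$cp1252 = mb_convert_encoding($utf8, "Windows-1252", "UTF-8");

// The original is valid UTF-8; the converted string no longer is.
var_dump(mb_check_encoding($utf8, "UTF-8"));   // bool(true)
var_dump(mb_check_encoding($cp1252, "UTF-8")); // bool(false)

// Each accented character is now a single byte.
echo bin2hex($cp1252), "\n"; // e974e9
```

If the second check returns false and the hex dump shows single bytes such as e9, the conversion did happen, whatever the editor displays.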

Note that if I manually change the encoding to Windows-1252 in Atom, the characters are correct.

Has anyone run into the same problem? Is there another way in PHP to specify the encoding of the file?

  • Not sure if this is still the case, but in the past `mb_detect_encoding` wouldn't detect text as UTF-8 unless you passed it the strict option. – Powerlord Nov 15 '19 at 15:20
  • @Powerlord I forgot to mention that I also tried `$myText = mb_convert_encoding($myText, "Windows-1252", "UTF-8");` – Loïc Van Gaver Nov 15 '19 at 15:45
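The two approaches from the comments can be sketched side by side as follows. This assumes mbstring is loaded and uses a hypothetical sample string. What `mb_detect_encoding` returns can vary between PHP versions, so no expected value is shown for it.

```php
<?php
// Hypothetical sample input: "Hélène" as UTF-8 bytes.
$myText = "H\xC3\xA9l\xC3\xA8ne";

// Strict detection against an explicit candidate list (third argument = strict).
$detected = mb_detect_encoding($myText, ["UTF-8", "Windows-1252"], true);
var_dump($detected);

// Skipping detection entirely: state the source encoding explicitly.
$converted = mb_convert_encoding($myText, "Windows-1252", "UTF-8");
echo bin2hex($converted), "\n"; // 48e96ce86e65
```

When the source encoding is already known, passing it explicitly avoids depending on detection at all.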

1 Answer

I figured it out. The code to use was (as pointed out by @Powerlord):

$monTexteTXT = mb_convert_encoding($monTexteTXT, "Windows-1252", "UTF-8");

I got a false negative when I first tried this solution: when I opened the file, the characters seemed broken. But once it was opened with Outlook, it was fine.
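To check the result without relying on how an editor displays it, something along these lines could be used. It is only a sketch: the sample text and the temporary file path are assumptions, and mbstring must be loaded.

```php
<?php
// Hypothetical signature text: "Société" as UTF-8 bytes.
$monTexteTXT = "Soci\xC3\xA9t\xC3\xA9";

// Convert from UTF-8 to Windows-1252 before writing the .txt file.
$converted = mb_convert_encoding($monTexteTXT, "Windows-1252", "UTF-8");

// Assumed location, for the sake of the example only.
$path = sys_get_temp_dir() . DIRECTORY_SEPARATOR . "signature.txt";
file_put_contents($path, $converted);

// Read the bytes back: each "é" should now be the single byte E9.
$bytes = file_get_contents($path);
echo bin2hex($bytes), "\n"; // 536f6369e974e9

// Decoding as Windows-1252 should give back the original UTF-8 text.
var_dump(mb_convert_encoding($bytes, "UTF-8", "Windows-1252") === $monTexteTXT); // bool(true)
```

An editor that opens this file as UTF-8 will still show broken characters, because the bytes are no longer UTF-8; that matches the false negative described above.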