
print_r() shows the Chinese characters correctly, but once the content is written to the file and I open it in Notepad, Wordpad or Microsoft Word, it shows up as 美元共付é¡Â計劃

        print_r($contents);
        exit;
        if (file_put_contents($this->outPath.$dataFile, $contents) === false) {
            $this->_throwError('Error: Unable to write data file: "'.$this->outPath.$dataFile.'"');
        }

I've also tried this without luck:

        $handle = fopen($this->outPath.$dataFile, "w+b");  //binary mode prevents any conversion
        fwrite($handle, $contents);
        fclose($handle);

The text is valid UTF-8, so I'm unsure what the problem could be.

McDowell
Ben

1 Answer


Simple: Notepad, Wordpad and Word are all Microsoft products, which notoriously suck with encodings and expect UTF-8 encoded files to begin with a BOM (byte order mark). If you don't include the BOM, they often fail to recognize the encoding and misinterpret the file as a legacy single-byte encoding, which produces the garbage you're seeing.

Adding a BOM to UTF-8 encoded files is not typically recommended; only the Microsoft universe seems to require or prefer it. You need to decide which is worse: possibly introducing problems in other tools by including a BOM, or having to manually open the file in Word/Notepad with the right encoding.

A BOM in PHP is simply this:

    $bom = "\xEF\xBB\xBF";

Prepend it to the contents before you write the file.
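As a rough sketch of what that could look like, assuming `$contents` already holds valid UTF-8 as stated in the question: `writeUtf8WithBom()` is a hypothetical helper name, not something from the original code, and the check for an existing BOM is an extra precaution I'm adding so the marker is never written twice.

```php
<?php
// Hypothetical helper (not part of the question's code): writes $contents
// to $path and prepends a UTF-8 BOM unless the data already starts with one.
// Assumes $contents is already valid UTF-8; no conversion is performed.
function writeUtf8WithBom(string $path, string $contents): bool
{
    $bom = "\xEF\xBB\xBF";

    // Avoid a doubled BOM if the data already begins with these three bytes.
    if (strncmp($contents, $bom, 3) !== 0) {
        $contents = $bom . $contents;
    }

    // file_put_contents() returns false on failure, otherwise the byte count.
    return file_put_contents($path, $contents) !== false;
}
```

In the question's code this would replace the direct `file_put_contents()` call, e.g. `writeUtf8WithBom($this->outPath.$dataFile, $contents)`, with the same error handling when it returns false.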

deceze