
I am using PHP and IIS 8 on Windows Server 2012. I create a UTF-8 encoded file on the server and then push it to the client with the following code:

    // Send the generated file to the client as a download
    $fullPath = $full_name;
    $fsize = filesize($fullPath); // size in bytes of the file on disk
    header("Content-Type: application/octet-stream");
    header("Content-Disposition: attachment; filename=\"" . $file_name . "\"");
    header("Content-Length: $fsize");
    header("Cache-Control: private");
    readfile($fullPath);

When the client receives the file, it starts with a BOM and (oh surprise) the last 3 characters of the file are missing. I have checked the file on the server and it is saved correctly there (no BOM, and the three missing characters are present).

The PHP script which creates and uploads the file has the header

    header('Content-Type: text/html; charset=UTF-8');

I have resorted to appending three line feeds to the end of the file so that the three missing characters get through. I could also have added 3 to the variable $fsize, but I do not feel comfortable with that kind of cheat (it might backfire). I think there should be a more elegant way out of this.

Curiously, I am using the same code on a Win7 machine with IIS 7.5 and no BOM is added there. The PHP directory is a copy of the directory on the Win7 machine, including the php.ini file.

Can someone see what I am missing?

Thanks in advance for your help.

Fabio
Manuel
  • Don't send text as `application/octet-stream`. Send it as text. – Matt Ball May 13 '13 at 14:11
  • The last 3 characters are stripped because you send the correct Content-Length header (computed without the BOM), and the browser discards everything after that many bytes. – pozs May 13 '13 at 14:12
  • Thanks Matt, that is reasonable as I am just sending text. I just changed my code, but the BOM is still being added. – Manuel May 13 '13 at 14:14
  • Thanks pozs, that's what I thought and that's why I am adding three additional "line feeds" at the end of the file: just to get the three missing characters. I am looking for a more elegant solution than this. – Manuel May 13 '13 at 14:19
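
To make the truncation described in the comments concrete, here is a minimal sketch that simulates a client which stops reading once it has received Content-Length bytes. The function name `simulateClientBody` and the sample string are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper: returns what a client keeps when it stops reading
// after Content-Length bytes, where Content-Length is the file size only.
function simulateClientBody(string $strayOutput, string $fileContents): string
{
    $contentLength = strlen($fileContents);     // same value filesize() would report
    $bodyOnWire = $strayOutput . $fileContents; // stray bytes are sent before the file
    return substr($bodyOnWire, 0, $contentLength);
}

$bom = "\xEF\xBB\xBF";  // UTF-8 BOM, 3 bytes
$file = "Hello, world"; // stand-in for the real file contents

$received = simulateClientBody($bom, $file);

echo bin2hex(substr($received, 0, 3)), "\n"; // efbbbf
echo substr($received, 3), "\n";             // Hello, wo
```

The three extra bytes at the front push the last three bytes of the file past the advertised length, which matches the symptom in the question.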

1 Answer

You should check for a BOM in the script file too. If your IDE saves UTF-8 files with a BOM, the BOM usually sits before the opening <?php tag, so PHP treats it as output.
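
For reference, a UTF-8 BOM is the three bytes EF BB BF. The following is only a sketch of how one might detect or strip it from a string of bytes; the helper names `startsWithUtf8Bom` and `stripUtf8Bom` are hypothetical, and the real fix is still to re-save the offending file without a BOM.

```php
<?php
// UTF-8 byte order mark: the three bytes EF BB BF.
const UTF8_BOM = "\xEF\xBB\xBF";

// Hypothetical helper: true if the given bytes start with a UTF-8 BOM.
function startsWithUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, UTF8_BOM, 3) === 0;
}

// Hypothetical helper: returns the bytes with a leading BOM removed, if any.
function stripUtf8Bom(string $bytes): string
{
    return startsWithUtf8Bom($bytes) ? substr($bytes, 3) : $bytes;
}

// A script saved "with BOM" looks like this to PHP: three bytes of literal
// output, followed by the opening tag.
$scriptSource = UTF8_BOM . "<?php echo 'hi';";

var_dump(startsWithUtf8Bom($scriptSource));               // bool(true)
var_dump(stripUtf8Bom($scriptSource) === "<?php echo 'hi';"); // bool(true)
```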

pozs
  • I'm afraid that is not the cause either, I am using TextPad. I am forcing it not to write a BOM for UTF-8 files. I also checked that one. – Manuel May 13 '13 at 14:21
  • Note that the BOM could also be in one of the included files. Have you done a BOM search yet? http://stackoverflow.com/questions/204765/elegant-way-to-search-for-utf-8-files-with-bom – pozs May 13 '13 at 14:32
  • Thanks pozs. That was the issue. I must have modified an included file with Notepad to change a password and it added the BOM to it. – Manuel May 13 '13 at 14:52
  • Good to know that BOMs in included files can also affect the main file. Case solved and something new learnt today. Thanks again. – Manuel May 13 '13 at 14:53
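
Since the culprit turned out to be an included file, one possible way to hunt for it from inside PHP is to check every file the script has loaded so far. This is a rough sketch, not the method used in the thread: `filesWithBom` is a hypothetical helper, and it assumes it is run after all the `include`/`require` calls have executed.

```php
<?php
// Hypothetical helper: returns the paths whose first three bytes are the
// UTF-8 BOM (EF BB BF). Unreadable paths are skipped.
function filesWithBom(array $paths): array
{
    $offenders = [];
    foreach ($paths as $path) {
        if (!is_file($path) || !is_readable($path)) {
            continue;
        }
        // Read only the first 3 bytes of the file.
        $head = file_get_contents($path, false, null, 0, 3);
        if ($head === "\xEF\xBB\xBF") {
            $offenders[] = $path;
        }
    }
    return $offenders;
}

// get_included_files() lists the main script plus everything included so far.
foreach (filesWithBom(get_included_files()) as $path) {
    echo "BOM found in: $path\n";
}
```

Any file it reports can then be re-saved as UTF-8 without a BOM in an editor that allows that choice.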