
I've been trying for some time now to export a properly encoded and formatted CSV file with PHP, but it's just not working. I've tried every tip in every CSV/PHP-related thread on Stack Overflow. I've checked that the data in my database is UTF-8 (it is), I've tried things like utf8_encode() on the whole CSV line, and I've checked that the PHP file itself is encoded in UTF-8, but still no success. When I run the file through http://csvlint.io/ I just get:

Your CSV appears to be encoded in ASCII-8BIT. We recommend you use UTF-8.

But I can't find a trace of any encoding other than UTF-8 anywhere in my code. Basically, this is my code:

First, I put all my CSV-rows in an array, then do this:

if (count($array) == 0)
{
    return NULL;
}
ob_start();
$df = fopen("php://output", 'w');
$csv = utf8_encode("header1|header2|header3|header4|header5|header6|header7\r\n");
foreach($array as $line) {
    $csv .= $line . "\r\n";
}
setlocale(LC_ALL, 'sv_SE', "swedish");
fwrite($df, "\xEF\xBB\xBF".$csv);
fclose($df);
return ob_get_clean();

And these are the headers sent:

$now = gmdate("D, d M Y H:i:s");
header("Expires: Tue, 03 Jul 2001 06:00:00 GMT");
header("Cache-Control: max-age=0, no-cache, must-revalidate, proxy-revalidate");
header("Last-Modified: {$now} GMT");
header("Content-Encoding: UTF-8");
header("Content-Type: text/csv; charset=UTF-8");
header("Content-Type: application/force-download");
header("Content-Type: application/octet-stream");
header("Content-Type: application/download");
header("Content-Disposition: attachment;filename={$filename}");
header("Content-Transfer-Encoding: binary");

Any ideas?

COil
nyy

2 Answers

1

The issue is the byte-order mark you're prepending to the output in this line:

fwrite($df, "\xEF\xBB\xBF".$csv);

If you change this to simply

fwrite($df, $csv);

you should find that the resulting file validates just fine (or at least, the validator no longer complains about its encoding).

Arguably this is a problem with the validator, since as the Wikipedia article notes,

The Unicode Standard permits the BOM in UTF-8, but does not require or recommend its use.

I don't recommend you use it either, as most software seems not to recognize byte-order marks. But if you must or you simply prefer to, you can safely ignore the warning from CSVLint.
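To check the output independently of CSVLint, a small sketch along these lines may help. It reports whether a string starts with a UTF-8 byte-order mark and whether the rest is valid UTF-8. The function name `describeCsvEncoding` is hypothetical and not part of the original code; the validity check relies on PCRE's `u` modifier, which makes `preg_match` fail on invalid UTF-8 input.

```php
<?php
// Hypothetical helper: inspects a CSV string for a UTF-8 BOM and checks
// that the bytes after it form valid UTF-8.
function describeCsvEncoding(string $csv): array
{
    $bom = "\xEF\xBB\xBF";
    $hasBom = strncmp($csv, $bom, 3) === 0;
    $body = $hasBom ? (string) substr($csv, 3) : $csv;

    return [
        'has_bom'    => $hasBom,
        // With the "u" modifier, preg_match returns false (not 1) when the
        // subject is not valid UTF-8.
        'valid_utf8' => preg_match('//u', $body) === 1,
    ];
}

// "\xC3\xA4" is the UTF-8 encoding of "ä".
$report = describeCsvEncoding("\xEF\xBB\xBF" . "Uppr\xC3\xA4ttande av|x\r\n");
```

Running this on the string returned by the export function, before any headers are involved, should show whether the problem is in the data itself or in how it is delivered.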


Since that is apparently not the issue, the next thing I'd look at is whether or not the data is being retrieved from the database in UTF-8. (I'll take your word you've already checked carefully to make sure the data is being stored in UTF-8.) If you're using MySQL, this will depend on the configuration of the database server and any options you may be sending the database after connection.

The PHP manual has a section on character sets and MySQL, and there is also this helpful article about using PHP and MySQL together with UTF-8 data. If you're using a different database system, it likely has equivalent configuration options that should be checked.
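As an illustration of the connection-level setting, here is a minimal sketch assuming MySQL accessed through PDO. The helper names, host, database name, and credentials are all placeholders, not taken from the question. The `charset` part of the DSN asks the server to send text to PHP in that character set.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests utf8mb4 for the
// connection, so text columns arrive in PHP as UTF-8.
function utf8MysqlDsn(string $host, string $database): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $database);
}

// Hypothetical helper: opens the connection. All arguments are placeholders
// for whatever the real application uses.
function openUtf8Connection(string $host, string $database, string $user, string $password): PDO
{
    return new PDO(utf8MysqlDsn($host, $database), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

$dsn = utf8MysqlDsn('localhost', 'exampledb');
```

If the code uses mysqli instead, calling `$mysqli->set_charset('utf8mb4')` right after connecting serves the same purpose.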

The only other suggestions I can make are that you

  • Move the call to setlocale higher in the script, before string concatenation begins in the foreach loop. (I don't think this setting affects simple concatenation, but I'm not certain.)

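  • Remove the Content-Encoding header from your output. It is invalid as used here: Content-Encoding names a compression coding such as gzip, not a character set. The charset belongs in the Content-Type header.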
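A trimmed-down header set might look like the sketch below. It is only one reasonable arrangement, and `csvDownloadHeaders` is a hypothetical helper name. Note that `header()` replaces an earlier header of the same name by default, so with several Content-Type calls in a row only the last one is sent; that may be why the charset never reaches the client in the original code.

```php
<?php
// Hypothetical helper: returns the header lines for a UTF-8 CSV download.
// A single Content-Type carries the charset; there is no Content-Encoding.
function csvDownloadHeaders(string $filename): array
{
    // Drop characters that would break the quoted filename or the header line.
    $safeName = str_replace(['"', "\r", "\n"], '', $filename);

    return [
        'Content-Type: text/csv; charset=UTF-8',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Cache-Control: no-cache, must-revalidate',
    ];
}

$headers = csvDownloadHeaders('export.csv');

// In a web request, send them before any output:
// foreach ($headers as $line) {
//     header($line);
// }
```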

  • Hi Simon, thanks for the answer, but I tried this earlier (and now again just to be sure), and it's not working. One could of course choose not to trust the validator, but I think it's correct because I have some Swedish characters in the data that are not displayed correctly, ie: "Uppr��ttande av" should be "Upprättande av" – nyy Jan 18 '17 at 09:57
  • @nyy, I've updated my answer with further suggestions. –  Jan 18 '17 at 10:59
0

Try to use this code:

$filename = 'csv/'.date('Y-m-d_H:i:s').'.csv';
$fp = fopen($filename, 'w');

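// Write the UTF-8 byte-order mark once, at the very start of the file.
fwrite($fp, "\xEF\xBB\xBF");

foreach ($csvData as $fields) {
    // Let fputcsv handle quoting; ';' is the field delimiter.
    fputcsv($fp, $fields, ';');
}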

fclose($fp);
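For the asker's case, where the function returns the CSV as a string rather than writing a file, a variant of the same idea could look like this. It is a sketch under assumptions: `buildCsv` and its parameters are hypothetical names, the rows are assumed to be arrays of already-UTF-8 values, and the BOM is written at most once at the start.

```php
<?php
// Hypothetical helper: builds a semicolon-delimited CSV string in memory,
// optionally prefixed with a single UTF-8 byte-order mark.
function buildCsv(array $rows, bool $withBom = false): string
{
    $fp = fopen('php://temp', 'w+');

    if ($withBom) {
        fwrite($fp, "\xEF\xBB\xBF");
    }

    foreach ($rows as $fields) {
        // Delimiter ';', enclosure '"', escape character '\'.
        fputcsv($fp, $fields, ';', '"', '\\');
    }

    // Read back everything that was written to the temporary stream.
    rewind($fp);
    $csv = stream_get_contents($fp);
    fclose($fp);

    return $csv;
}

// "\xC3\xA4" is the UTF-8 encoding of "ä".
$csv = buildCsv([['id', 'title'], [1, "Uppr\xC3\xA4ttande av"]], true);
```

Using `php://temp` avoids the output buffering around `php://output`, and `fputcsv` takes care of quoting fields that contain the delimiter, quotes, or line breaks.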
wnull