I have a form to upload a CSV file:

<form method="post" action="#" enctype="multipart/form-data" accept-charset="utf-8">
  <p><input type="file" name="file" /></p>
  <p><input type="submit" /></p>
</form>

In my PHP script I do the following:

$temp = $_FILES["file"]["tmp_name"];

$fobject = new SplFileObject($temp);
$fobject->setFlags(SplFileObject::READ_CSV);
$fobject->setCsvControl(',', '"');

$data = [];
foreach ($fobject as $line) {
    $data[] = $line;
    print_r($line);
}

Now there is a BOM in the file, which shows up as stray characters at the start of the first CSV row. How can I remove it?

I Googled and tried some solutions (How to remove multiple UTF-8 BOM sequences before "<!DOCTYPE>"?), but they did not work.

I can use str_replace, but this does not seem like a best practice to me.

almo
  • Is the first row a header row? – userDEV Oct 31 '14 at 18:02
  • No, there is no header row. – almo Oct 31 '14 at 18:02
  • If you temporarily remove accept-charset="utf-8" from the form tag, do you still get the problem? Does your HTML doctype tag contain any attributes? Did you see this post? http://stackoverflow.com/questions/10290849/how-to-remove-multiple-utf-8-bom-sequences-before-doctype?lq=1 – userDEV Oct 31 '14 at 19:16
  • This link is actually what I tried and did not work. But now I found out that I have to call the function before doing utf8_decode. Thanks! – almo Oct 31 '14 at 19:40

1 Answer

Original post taken from here: chrisguitarguy/no_bom.php

<?php
$file = new \SplFileObject('some_file_with_bom.csv');
// http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
$bom = pack('CCC', 0xEF, 0xBB, 0xBF);
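$first = true;
// Without READ_CSV, each $line is a raw string, so substr() applies directly.
foreach ($file as $line) {
    // Only the very first line of the file can start with a BOM.
    if ($first && substr($line, 0, 3) === $bom) {
        $line = substr($line, 3);
    }

    $first = false;

    // $line no longer starts with a BOM; process it here
}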

Worked for me!
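
Note that the snippet above iterates over raw string lines. With SplFileObject::READ_CSV set, as in the question, each row is an array, so substr() cannot be applied to it directly. Here is a minimal sketch of one way to adapt the idea, assuming the BOM (if present) ends up at the start of the first field of the first row. The function name readCsvWithoutBom is hypothetical, not something from the original gist, and the explicit escape character is just PHP's historical default.

```php
<?php
// Hypothetical helper (not from the original gist): reads a CSV file with
// SplFileObject::READ_CSV and strips a UTF-8 BOM from the first field.
function readCsvWithoutBom(string $path): array
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 byte order mark

    $file = new SplFileObject($path);
    $file->setFlags(SplFileObject::READ_CSV);
    $file->setCsvControl(',', '"', '\\');

    $rows  = [];
    $first = true;
    foreach ($file as $row) {
        // Skip the empty pseudo-row that READ_CSV can yield for blank lines or EOF.
        if (!is_array($row) || $row === [null]) {
            continue;
        }

        // Only the first real row can carry the BOM, and only in its first field.
        if ($first && is_string($row[0]) && strncmp($row[0], $bom, 3) === 0) {
            $row[0] = substr($row[0], 3);
        }
        $first = false;

        $rows[] = $row;
    }

    return $rows;
}
```

In the upload script this would be called as readCsvWithoutBom($_FILES["file"]["tmp_name"]). Because the BOM is stripped after parsing, a quoted first field preceded by a BOM may already have been split differently by the CSV parser, so it is worth testing with your own files.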

Egor Guriyanov