I have a CSV reader (code at the bottom). The first item of the first row contains unreadable characters, while the second row and the second item of the first row are fine. I found out I can view these characters when I array_map the variable with utf8_encode.

var_dump($data[0][0]);
for ($i = 0; $i < strlen($data[0][0]); $i++){
    var_dump($data[0][0][$i]);
}
var_dump(array_map("utf8_encode", $data[0])[0]);

returns

string(11) "voornaam"
string(1) "�" string(1) "�" string(1) "�" string(1) "v" string(1) "o" string(1) "o" string(1) "r" string(1) "n" string(1) "a" string(1) "a" string(1) "m"
string(14) "voornaam"

How can I remove these characters when I don't even know what to look for?
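One way to find out what the unknown bytes are is to dump them as hex instead of printing them. The sketch below is only an illustration: `$value` is a hypothetical stand-in for `$data[0][0]`, built on the assumption that the three unreadable bytes are a UTF-8 byte order mark (EF BB BF), which would be consistent with the lengths shown above.

```php
<?php
// Hypothetical sample: assumes the three unreadable bytes are a UTF-8 BOM
// placed in front of the 8-character header "voornaam".
$value = "\xEF\xBB\xBF" . "voornaam";

// bin2hex() shows the raw bytes, including those that print as "�".
$prefix = bin2hex(substr($value, 0, 3));
echo $prefix, PHP_EOL;        // efbbbf

// 3 BOM bytes + 8 letters gives the length reported by var_dump.
echo strlen($value), PHP_EOL; // 11
```

Under the same assumption, the length of 14 after utf8_encode also fits: each of the three bytes is treated as a Latin-1 character and expanded to two bytes, so 6 + 8 = 14.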

CSV Reader

Variables:

$this->fp = fopen($file_name, "r");
$this->parse_header = false;
$this->delimiter = ",";
$this->length = 1000;

Actual code:

function get($max_lines = 0)
{
    //if $max_lines is set to 0, then get all the data

    $data = array();

    if ($max_lines > 0)
        $line_count = 0;
    else
        $line_count = -1; // so loop limit is ignored

    while ($line_count < $max_lines && ($row = fgetcsv($this->fp, $this->length, $this->delimiter)) !== FALSE) {
        if ($this->parse_header) {
            foreach ($this->header as $i => $heading_i) {
                $row_new[$heading_i] = $row[$i];
            }
            $data[] = $row_new;
        } else {
            $data[] = $row;
        }

        if ($max_lines > 0)
            $line_count++;
    }
    return $data;
}

The function is called without a max_lines argument.

Wanjia
  • Your CSV file has a UTF-8 BOM. – deceze Nov 13 '19 at 14:04
  • All I did was "save as csv", is it possible to remove the BOM or convert it to regular UTF-8? – Wanjia Nov 13 '19 at 14:06
  • Depends on what program you "saved as csv" from… see https://stackoverflow.com/a/48750807/476. – deceze Nov 13 '19 at 14:13
  • Found the answer to what I was searching for as a comment in the above link. I do think this question is still important for people who might not know about the UTF-8 BOM, as I didn't – Wanjia Nov 13 '19 at 14:18
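
For readers who land here without following the link, here is a minimal sketch of stripping a UTF-8 BOM in PHP. It is not taken from the linked answer. `strip_utf8_bom` is a hypothetical helper name, and the sample row is assumed data shaped like the first row in the question.

```php
<?php
// Hypothetical helper (not part of the original reader): removes a leading
// UTF-8 BOM (bytes EF BB BF) from a string and leaves other strings unchanged.
function strip_utf8_bom(string $value): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($value, $bom, 3) === 0) {
        return substr($value, 3);
    }
    return $value;
}

// Assumed sample row, shaped like the first row returned by fgetcsv().
$row = ["\xEF\xBB\xBF" . "voornaam", "achternaam"];
$row[0] = strip_utf8_bom($row[0]);

var_dump($row[0]); // string(8) "voornaam"
```

In the reader from the question, this would only need to be applied to the first field of the first row. If the first field can be quoted, the BOM sits before the opening quote, so it may be more robust to read and discard the three bytes from the file handle before the first fgetcsv call.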
