I have a PHP script that reads from a CSV file. The file is UTF-8 encoded, but the code below treats it as ASCII. How can I change the code to read the file as UTF-8?

if (($handle = fopen("books.csv", "r")) === FALSE)
  throw new Exception("Couldn't open books.csv");
while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {

[EDIT] One issue with my current code is that the first value on the first line always has the three bytes that identify UTF-8 files prepended to it. So I guess a solution that operates value by value or row by row might not be good enough?

Bishoy

1 Answer

Use fgets() to read the whole file into the string variable $data, then mb_convert_encoding() to convert the encoding, then str_getcsv() to convert the string into an array.

if (($handle = fopen("books.csv", "r")) === FALSE)
    throw new Exception("Couldn't open books.csv");

$data = "";

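// read the whole file into $data, one chunk at a time
while (($line = fgets($handle, 5000)) !== FALSE) {
    $data .= $line;
}
fclose($handle);

// convert the encoding to UTF-8
$data = mb_convert_encoding($data, "UTF-8", "auto");

// str_getcsv() parses a single record, so split into lines and parse each one
// (this simple split does not handle quoted fields that contain line breaks)
$array = array_map('str_getcsv', preg_split('/\R/', rtrim($data)));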
Jin.C
  • $data is an array in my code. Is there a way to apply this to the stream before using it in fgetcsv? – Bishoy Feb 10 '17 at 01:30
  • Sorry, I wrote the wrong variable, and I have changed the function. – Jin.C Feb 10 '17 at 01:57
  • Still not working :) the error is saying first parameter should be a string – Bishoy Feb 10 '17 at 02:00
  • Thanks Jin for trying to help, but this didn't remove the three mangled characters that appear in the first item on the first line – Bishoy Feb 10 '17 at 03:56
  • How about using fgets() to read the whole file into the string variable $data, then mb_convert_encoding() to convert the encoding, then str_getcsv() to convert the string into an array? – Jin.C Feb 10 '17 at 04:20
  • That's pretty much what I am looking for. Can you help with example code showing how to get from fgets into a string? – Bishoy Feb 10 '17 at 05:54
  • Same results, mate. I guess the convert call here converts from 'auto' to UTF-8, but in my case the file is already UTF-8 and PHP is not skipping the UTF-8 identification mark (the first three bytes). So I am looking for a way to make PHP understand that it is reading a UTF-8 file, not a way to convert the file content to UTF-8. – Bishoy Feb 10 '17 at 07:46
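
For what the last comment describes, one possible approach is to skip the byte order mark on the stream itself before handing it to fgetcsv(). The three bytes in question are the UTF-8 BOM (EF BB BF). PHP strings are plain byte strings and fgetcsv() does not decode anything, so once the BOM is out of the way the remaining values are already UTF-8. The following is only a minimal sketch under that assumption; the helper names openCsvSkippingBom() and readCsvRows() are hypothetical and do not come from the thread.

```php
<?php
// Hypothetical helper (not from the thread): opens a CSV file and leaves the
// stream positioned just past a UTF-8 BOM (EF BB BF) if the file starts with one.
function openCsvSkippingBom(string $path)
{
    $handle = fopen($path, "r");
    if ($handle === false) {
        throw new Exception("Couldn't open $path");
    }
    // Read the first three bytes; if they are not the BOM, go back to the start.
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        rewind($handle);
    }
    return $handle;
}

// Hypothetical helper: reads every row with fgetcsv(), as in the question's loop.
function readCsvRows(string $path): array
{
    $handle = openCsvSkippingBom($path);
    $rows = [];
    // Enclosure and escape are passed explicitly; they match the historical defaults.
    while (($data = fgetcsv($handle, 1000, ",", '"', "\\")) !== false) {
        $rows[] = $data;
    }
    fclose($handle);
    return $rows;
}
```

With this in place, something like $rows = readCsvRows("books.csv"); should give a first value on the first row without the three extra bytes. Files that have no BOM are read from the start as before, since the stream is rewound when the first three bytes do not match.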