I'm currently working on a web application that takes the data from a CSV file and converts it into an array for easier access to specific data in the file. However, I have a problem. Currently my code looks something like this:

$data = file_get_contents('data.csv');
$array = str_getcsv($data,"\n");
$i = 0;

foreach($array as $row){
     $newArray[$i] = str_getcsv($row,";");
     $i++;
}

This works fine for the most part, but it breaks when there is a line break inside an individual value. I'm working with product descriptions, and some of the companies put intentional line breaks in their descriptions. When I open the file in Excel I can see these line breaks clearly. My question is: how do I deal with them? I've tried a lot of different approaches and read a lot online, but nothing seems to work for me.

I hope you can help me find a solution.

EDIT

Here is an example from the CSV file

5124;"Altid billig el og 5 stjerner på Trustpilot";"Altid billig el og 5 stjerner på Trustpilot.@ Vi har kun ét produkt og du skal ikke længere spekulere i om du nu har en billig elleverandør efter 6 måneder eller lign. Vi går efter at være blandt de billigste elleverandører, hvilket vi også beviser hvert år."

I don't know if it makes much sense, but this is where the problem is. In this particular example there is a "forced" line break when I open it in Excel, where I placed a bold "@". As far as I can see, this should be valid?

resonance
  • Just curious: Is the CSV file valid? Are there double quotes around the value containing the line break? You don't give an example, so I cannot tell. – KIKO Software Jun 09 '16 at 08:30
  • I added an example in the post now. @KIKOSoftware – resonance Jun 09 '16 at 09:29
  • I don't think line breaks inside values are permissible in the CSV format. For instance, this page: http://edoceo.com/utilitas/csv-file-format starts with **Each record is one line**. More details here: https://www.ietf.org/rfc/rfc4180.txt – KIKO Software Jun 09 '16 at 09:36
  • Are you able to export the data with different settings? Excel handles this differently from other CSV handlers. http://stackoverflow.com/questions/1241220/generating-csv-file-for-excel-how-to-have-a-newline-inside-a-value – Markus Jun 09 '16 at 09:41
  • @KIKO Err... first bullet point on that page: *"Each record is one line - Line separator may be LF (0x0A) or CRLF (0x0D0A), __a line separator may also be embedded in the data (making a record more than one line but still acceptable).__"* – deceze Jun 09 '16 at 09:42
  • Yes, I saw that too. Sorry. – KIKO Software Jun 09 '16 at 09:43
  • For me his example works fine when I have `\r\n` or `\n` instead of the @. Just forget about Excel and check what `print_r($newArray)` gives you. – Markus Jun 09 '16 at 09:47

1 Answer

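str_getcsv expects a single row of CSV-formatted text; you cannot use it to parse an entire file. Let fgetcsv read and parse the file instead. It reads one record at a time and handles line breaks inside quoted values. Your sample is semicolon-delimited, so pass ';' as the delimiter:

$file = fopen('data.csv', 'r');
$data = [];

// The default delimiter is ','; the sample file uses ';'.
while (($row = fgetcsv($file, 0, ';')) !== false) {
     $data[] = $row;
}

fclose($file);

var_dump($data);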
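
To check that a line break inside a quoted value survives parsing, here is a minimal self-contained sketch. It assumes a semicolon delimiter like the sample row in the question, and it uses made-up data written to a php://memory stream instead of data.csv; the sample values and variable names are illustrative only.

```php
<?php
// Hypothetical sample: the third field of the first record contains a line break.
$csv = "5124;\"Title\";\"First line.\nSecond line.\"\n"
     . "5125;\"Other\";\"Single line\"\n";

// Write the sample to an in-memory stream so no file on disk is needed.
$handle = fopen('php://memory', 'r+');
fwrite($handle, $csv);
rewind($handle);

$rows = [];
// Delimiter, enclosure and escape are passed explicitly; 0 means no line-length limit.
while (($row = fgetcsv($handle, 0, ';', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($handle);

// Two records are read, and the line break stays inside the third field of the first one.
var_dump(count($rows));
var_dump($rows[0][2]);
```

The same loop should work on a real file by replacing the php://memory handle with fopen('data.csv', 'r').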
deceze