6

I need to parse a CSV string and, as a first step, I would like to split it into lines. I use the str_getcsv function to do it, but it seems to fail even in the most basic scenario: a newline surrounded by quotes.

$rows = '7;"Hi
";3';
$array = str_getcsv($rows,"\n",'"');
print_r($array);

The result should be an array with just one value, but I got two, surprisingly split right where the quoted newline is...

result:
Array
(
    [0] => 7;"Hi
    [1] => ;3
)

What am I doing wrong?

EDIT: Weird, but when I tried

$rows = '7;"Hi
";3
8;Hello;6';

The result is

Array
(
    [0] => 7;"Hi
    [1] => ;3
8;Hello;6
)
Charlestone
  • PHP's built-in CSV functions are variously unreliable and should be used with caution, if at all. It's simple enough to write your own custom functions. – Martin Aug 21 '17 at 12:09
  • Try `$array = str_getcsv($rows,PHP_EOL,'"');` – Martin Aug 21 '17 at 12:09
  • See also: you [need the correct character encoding and BOM](https://stackoverflow.com/questions/4348802/how-can-i-output-a-utf-8-csv-in-php-that-excel-will-read-properly) with your CSV file. – Martin Aug 21 '17 at 12:11

5 Answers

1

It seems that str_getcsv() does not handle newlines inside quoted fields when it is asked to split on newlines: it breaks the input at every newline without checking whether that newline is inside a field. Something other than this function is needed for splitting a CSV string into lines.

On the manual page of the str_getcsv() function I found code, posted by user normadize -a- gmail -d- com, that parses the CSV directly without first splitting it into lines. The only thing missing is that it does not use the values from the first line as keys for the other lines.

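function parse_csv ($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
    // Temporarily replace escaped quotes ("") with a placeholder.
    $enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string);
    // URL-encode every quoted field so embedded newlines and delimiters cannot split it.
    // Note: utf8_encode() and utf8_decode() are deprecated as of PHP 8.2.
    $enc = preg_replace_callback(
        '/"(.*?)"/s',
        function ($field) {
            return urlencode(utf8_encode($field[1]));
        },
        $enc
    );
    // Split into lines; newlines inside quoted fields are encoded at this point.
    $lines = preg_split($skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s', $enc);
    return array_map(
        function ($line) use ($delimiter, $trim_fields) {
            // Split the line into fields, then decode each field and restore the quotes.
            $fields = $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line);
            return array_map(
                function ($field) {
                    return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
                },
                $fields
            );
        },
        $lines
    );
}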
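The missing piece mentioned above, using the first line as keys for the remaining lines, could be sketched roughly as follows. `rows_to_assoc` is a hypothetical helper name, not part of PHP or of the quoted code. It assumes the first row holds unique column names, and it silently skips rows whose field count differs from the header.

```php
<?php
// Hypothetical helper: turn parsed rows into associative arrays
// keyed by the values of the first (header) row.
function rows_to_assoc(array $rows)
{
    if (count($rows) === 0) {
        return [];
    }
    $header = array_shift($rows); // the first row supplies the keys
    $result = [];
    foreach ($rows as $row) {
        // array_combine() needs arrays of equal length, so skip malformed rows.
        if (count($row) !== count($header)) {
            continue;
        }
        $result[] = array_combine($header, $row);
    }
    return $result;
}

// Usage with rows shaped like the return value of parse_csv().
// The column names here are made up for illustration.
$parsed = [
    ['id', 'text', 'count'],
    ['7', "Hi\n", '3'],
    ['8', 'Hello', '6'],
];
$assoc = rows_to_assoc($parsed);
```

Skipping malformed rows is only one possible policy; throwing an exception instead may be preferable when the input is expected to be clean.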
Charlestone
0

str_getcsv() doesn't seem to cope with newlines within column data when it is called this way. You can probably work around this though:

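$row = '7;"Hi' . PHP_EOL . '";3';
// Hide the newlines from str_getcsv() by swapping them for a NUL byte...
$row = str_replace(PHP_EOL, "\0", $row);
$array = str_getcsv($row, "\n", '"');
// ...then restore them in the parsed values.
$array = array_map(function ($v) {
    return str_replace("\0", PHP_EOL, $v);
}, $array);
print_r($array);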

Array
(
    [0] => 7;"Hi
";3
)
apokryfos
0

This works for me on:

php -v PHP 5.6.30 (cli) (built: Feb 7 2017 16:18:37)

php > $row = "\"abc\"\n;\"def\"\n;\"123\"";
php > $array = str_getcsv($row,"\n",'"');
php > print_r($array);
Array
(
    [0] => abc
    [1] => ;"def"
    [2] => ;"123"
)

and this

php > $row = "\"abc\"\n\"def\"\n\"123\"";
php > 
php > 
php > $array = str_getcsv($row,"\n",'"');
php > print_r($array);
Array
(
    [0] => abc
    [1] => def
    [2] => 123
)

etc (all other cases).

Maybe the string in your variable contains some other line-break characters, different from \n, \r or \r\n?

Oleg
  • This works for me too... The problem occurs when the newline is WITHIN the quotes: `$row = '"abc\ndef"\n"123"'`. This should result in `Array ( [0] => "abc\ndef", [1] => 123) ` but the actual result is `array ([0] => abc, [1] => def, [2] => 123)` – Charlestone Sep 22 '19 at 08:02
0

What am I doing wrong?

You called the function with the wrong parameters ...

http://php.net/manual/en/function.str-getcsv.php:

delimiter
Set the field delimiter (one character only).

The second parameter is the field delimiter, and that is ;, not the newline.

$rows = '7;"Hi
";3';
$array = str_getcsv($rows,';','"');
var_dump($array);

// output:
array(3) {
  [0]=>
  string(1) "7"
  [1]=>
  string(4) "Hi
"
  [2]=>
  string(1) "3"
}

EDIT:

The result should be array with just one value

Why one value? What is your actual field delimiter character supposed to be here? I assumed it was the ; ... but if you expect to get only one value from the shown example input, then what is the delimiter? Or how is it even CSV …?

2nd Edit:

first I need to get lines - field delimiter is a newline

That is not really how str_getcsv works - that assumes that you have just one single line of your CSV data as your input string already. And you can’t really use something like a simple explode at newline to “get” single lines - because the newline character can also be inside the fields.

fgetcsv would be the proper function to achieve both ... but that operates on files, not string data. Perhaps PHP’s various streams could help with that somehow, if you “write” your string data into a temp file or memory, so that you could actually use fopen to get a readable stream that fgetcsv can then work with ... this question can give you an idea of how that could work: how to use fgetcsv with strings
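The stream idea described above might look roughly like the following sketch. `csv_string_to_rows` is a hypothetical name. The sketch assumes `;` as the field delimiter and `"` as the enclosure, as in the question's sample data, and it writes the string to a `php://memory` stream so that nothing touches the disk.

```php
<?php
// Hypothetical helper: parse a whole CSV string by letting fgetcsv()
// read it from an in-memory stream, one record at a time.
function csv_string_to_rows($csv, $delimiter = ';', $enclosure = '"')
{
    $stream = fopen('php://memory', 'r+');
    fwrite($stream, $csv);
    rewind($stream); // move back to the start before reading

    $rows = [];
    // fgetcsv() keeps reading past newlines that sit inside an enclosure.
    while (($row = fgetcsv($stream, 0, $delimiter, $enclosure, '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as an array holding a single null
        }
        $rows[] = $row;
    }
    fclose($stream);
    return $rows;
}

// Usage with the second example from the question.
$csv = "7;\"Hi\n\";3\n8;Hello;6";
$rows = csv_string_to_rows($csv);
```

For large inputs, `php://temp` could be used in place of `php://memory`; it switches to a temporary file once the data grows past a size limit.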

CBroe
  • It was just an example of what is happening to me... I need to split a much, much larger CSV string into lines, so that I can then parse it line by line to get the fields. It's all written in the first sentence... – Charlestone Aug 21 '17 at 12:26
  • If I used ';' I would just get a bunch of values, not knowing where one line ends and another begins – Charlestone Aug 21 '17 at 12:28
  • So what are the actual delimiters then? What is the field delimiter? Your “example data” does not really make that clear. – CBroe Aug 21 '17 at 12:32
  • first I need to get lines - field delimiter is a newline. After that, if I wanted to parse a line, the delimiter would be ';'. But if I used ';' from the beginning on the "second example", the result would be something like array('7','Hi\n','3\n','7','Hello','6') – Charlestone Aug 21 '17 at 12:54
  • _“first I need to get lines - field delimiter is a newline”_ - that is not really how `str_getcsv` works - that assumes that you have just _one line_ of your CSV data in your input string already. And you can’t really use something like a simple explode at newline to “get” single lines - _because_ the newline character can also be inside the fields. – CBroe Aug 21 '17 at 13:14
  • `fgetcsv` would be the proper function to achieve both ... but that operates on files, not string data. Perhaps PHP’s various _streams_ could help with that somehow, if you “write” your string data into a temp file or memory, so that you could actually use fopen to get a readable stream that fgetcsv can then work with ... – CBroe Aug 21 '17 at 13:14
  • @Charlestone, I added the last two comments to the answer, plus a reference to another question about how to use fgetcsv to read from a string, instead of a file. – CBroe Aug 21 '17 at 16:59
0

Use fgetcsv instead. It handles newlines enclosed in quotes. It can also work on a string if you create a stream resource from that string.


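// Example input from the question; replace it with your own CSV string.
$csv = "7;\"Hi\n\";3\n8;Hello;6";
// Base64-encode the payload so that characters such as % survive the data: URI.
$stream = fopen('data://text/plain;base64,' . base64_encode($csv), 'r');
$rows = [];
// The question's data uses ';' as the field delimiter.
while (($row = fgetcsv($stream, 0, ';')) !== false) {
    $rows[] = $row;
}
fclose($stream);

var_dump($rows);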
Prescol