185

Suppose I have a .csv file with the following content:

 "text, with commas","another text",123,"text",5; 
 "some    without commas","another text",123,"text";
 "some text with  commas or no",,123,"text"; 

How can I parse the content through PHP?

Martin Thoma
  • 124,992
  • 159
  • 614
  • 958
smith
  • 5,341
  • 8
  • 31
  • 38
  • You're basically asking if there is a better OOP way to deal w/ CSV parsing than the stock global function approach. I'd say reword the question, as this does not sound like an issue parsing a CSV really. – quickshiftin Feb 04 '12 at 07:33
  • 1
    @quickshiftin sorry about that – smith Feb 04 '12 at 07:36
  • It's fine, I'm just saying... If you want a class [this one](http://code.google.com/p/parsecsv-for-php/) is OK (I've tweaked it a bit in my work tho..) – quickshiftin Feb 04 '12 at 07:43

6 Answers6

246

Just use the function for parsing a CSV file

http://php.net/manual/en/function.fgetcsv.php

$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
  while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
    $num = count($data);
    echo "<p> $num fields in line $row: <br /></p>\n";
    $row++;
    for ($c=0; $c < $num; $c++) {
        echo $data[$c] . "<br />\n";
    }
  }
  fclose($handle);
}
thenetimp
  • 9,487
  • 5
  • 29
  • 42
  • 38
    it should be noted that this function does not correctly deal with quotes in CSV. Specifically, it can't deal with this example as found in wikipedia: https://en.wikipedia.org/wiki/Comma-separated_values#Example there has been an open bug, but it has been closed as "wont fix" https://bugs.php.net/bug.php?id=50686 – amenthes Sep 07 '15 at 21:30
  • 2
    Does not work correctly on columns with line breaks in their content too – Professorval Jul 23 '21 at 00:39
165

A bit shorter answer since PHP >= 5.3.0:

    $csvFile = file('../somefile.csv');
    $data = [];
    foreach ($csvFile as $line) {
        $data[] = str_getcsv($line);
    }
Aldekein
  • 3,538
  • 2
  • 29
  • 33
  • 32
    Note that this doesn't work if you have any newlines in the actual values (not the line delimiters at the end of each csv line), because the `file` function splits on newlines and isn't aware of the quotation marks that CSV uses to contain field values. – Jordan Lev Jan 13 '16 at 19:52
  • @JordanLev so what do you recommend then? – Julix May 15 '17 at 00:39
  • 6
    @Julix use [the accepted answer](http://stackoverflow.com/a/9139210/477513) . This shorter version is nice if you know the imported data will never have linebreaks within a single value, but otherwise the more robust solution is worth the extra lines of code. – Jordan Lev May 15 '17 at 03:57
  • I ended up encoding before saving to CSV and decoding while reading - http://php.net/rawurlencode - that ensures no line-breaks, right? - or does it loose them entirely (can there be line breaks in URL encoding?) – Julix May 15 '17 at 04:24
  • This answer perfectly works for new line and uses the magic of php file function. Thanks – Madhab452 Aug 05 '17 at 10:00
  • str_getcsv method contains "separator" parameter default "," if you need you can use ";" also. – devugur Aug 19 '22 at 09:32
150

Handy one liner to parse a CSV file into an array

$csv = array_map('str_getcsv', file('data.csv'));
Maxim Kovalevsky
  • 1,564
  • 1
  • 9
  • 5
  • 28
    Note that this doesn't work if you have any newlines in the actual values (not the line delimiters at the end of each csv line), because the `file` function splits on newlines and isn't aware of the quotation marks that CSV uses to contain field values. – Jordan Lev Jan 13 '16 at 19:51
  • 5
    How to use different delimiter? ( ; instead of , ) – sdd Apr 03 '17 at 11:00
  • 13
    use the following to fix new line problem: array_map('str_getcsv', file('data.csv' , FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)); – Robert Sinclair Jun 07 '17 at 05:03
  • 34
    You know, it is nice to give credits to respective authors: http://php.net/manual/en/function.str-getcsv.php#114764 – xZero Jun 13 '17 at 11:31
  • @Maxim Kovalevsky, how to skip heading or first line – jvk Mar 01 '18 at 09:44
  • This might be obvious to others, but I just spent two hours trying to figure out what's wrong with PHP when my string quote was set to `'`. Make sure it's `"`, then this and other scripts will work. – confetti Apr 11 '20 at 12:00
  • This is excellent unless one has very large files. – DanimalReks Oct 08 '20 at 22:01
22

Just discovered a handy way to get an index while parsing. My mind was blown.

$handle = fopen("test.csv", "r");
for ($i = 0; $row = fgetcsv($handle ); ++$i) {
    // Do something will $row array
}
fclose($handle);

Source: link

Community
  • 1
  • 1
quartarian
  • 557
  • 4
  • 10
  • 1
    Our server had PHP 5.2.9 and I am unable to access str_getcsv because of that. fgetcsv on the other hand dates all the way to PHP 4, so this was helpful. Thanks. – Benjamin May 11 '17 at 17:58
  • Just wondering why a for loop was used instead of a while loop? – Valerie Dec 28 '20 at 22:46
  • 2
    @valerie I almost always need an index while parsing CSVs. The for loop provides the index without a separate declaration and incrementer. – quartarian Dec 30 '20 at 02:51
9

I love this

$data = str_getcsv($CsvString, "\n"); //parse the rows
foreach ($data as &$row) {
    $row = str_getcsv($row, "; or , or whatever you want"); //parse the items in rows 
    $this->debug($row);
}

in my case I am going to get a csv through web services, so in this way I don't need to create the file. But if you need to parser with a file, it's only necessary to pass as string

brasofilo
  • 25,496
  • 15
  • 91
  • 179
ackuser
  • 5,681
  • 5
  • 40
  • 48
  • This didn't work correctly for me. For example, this: `aaa,bbb,"ccc\nddd",eee` was parsed into two lines (instead of the desired one line) instead of one. It seems that `"` is not recognized as enclosure when it appears inside the field (rather than in its beginning or end). So `$data = str_getcsv(..)` can be replaced with `$data = explode(..)`, which I'm guessing is more efficient, and conveys intention better... – obe May 22 '20 at 00:21
6

I have been seeking the same thing without using some unsupported PHP class. Excel CSV doesn't always use the quote separators and escapes the quotes using "" because the algorithm was probably made back in the '80s or something. After looking at several .csv parsers in the comments section on PHP.NET, I have seen ones that even used callbacks or eval'd code and they either didn't work like needed or simply didn't work at all. So, I wrote my own routines for this and they work in the most basic PHP configuration. The array keys can either be numeric or named as the fields given in the header row. Hope this helps.

    function SW_ImplodeCSV(array $rows, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
    // SW_ImplodeCSV - returns 2D array as string of csv(MS Excel .CSV supported)
    // AUTHOR: tgearin2@gmail.com
    // RELEASED: 9/21/13 BETA
      { $r=1; $row=array(); $fields=array(); $csv="";
        $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
        $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

        if($mode=='EXCEL')// escape code = ""
         { $delim=','; $enclos='"'; $rowbr="\r\n"; }
        else //mode=STANDARD all fields enclosed
           { $delim=','; $enclos='"'; $rowbr="\r\n"; }

          $csv=""; $i=-1; $i2=0; $imax=count($rows);

          while( $i < $imax )
          {
            // get field names
            if($i == -1)
             { $row=$rows[0];
               if($fmt=='2D_FIELDNAME_ARRAY')
                { $i2=0; $i2max=count($row);
                  while( list($k, $v) = each($row) )
                   { $fields[$i2]=$k;
                     $i2++;
                   }
                }
               else //if($fmt='2D_NUMBERED_ARRAY')
                { $i2=0; $i2max=(count($rows[0]));
                  while($i2<$i2max)
                   { $fields[$i2]=$i2;
                     $i2++;
                   }
                }

               if($headerrow==true) { $row=$fields; }
               else                 { $i=0; $row=$rows[0];}
             }
            else
             { $row=$rows[$i];
             }
    
            $i2=0;  $i2max=count($row); 
            while($i2 < $i2max)// numeric loop (order really matters here)
            //while( list($k, $v) = each($row) )
             { if($i2 != 0) $csv=$csv.$delim;

               $v=$row[$fields[$i2]];

               if($mode=='EXCEL') //EXCEL 2quote escapes
                    { $newv = '"'.(str_replace('"', '""', $v)).'"'; }
               else  //STANDARD
                    { $newv = '"'.(str_replace($escapes2, $escapes, $v)).'"'; }
               $csv=$csv.$newv;
               $i2++;
             }

            $csv=$csv."\r\n";

            $i++;
          }

         return $csv;
       }

    function SW_ExplodeCSV($csv, $headerrow=true, $mode='EXCEL', $fmt='2D_FIELDNAME_ARRAY')
     { // SW_ExplodeCSV - parses CSV into 2D array(MS Excel .CSV supported)
       // AUTHOR: tgearin2@gmail.com
       // RELEASED: 9/21/13 BETA
       //SWMessage("SW_ExplodeCSV() - CALLED HERE -");
       $rows=array(); $row=array(); $fields=array();// rows = array of arrays

       //escape code = '\'
       $escapes=array('\r', '\n', '\t', '\\', '\"');  //two byte escape codes
       $escapes2=array("\r", "\n", "\t", "\\", "\""); //actual code

       if($mode=='EXCEL')
        {// escape code = ""
          $delim=','; $enclos='"'; $esc_enclos='""'; $rowbr="\r\n";
        }
       else //mode=STANDARD 
        {// all fields enclosed
          $delim=','; $enclos='"'; $rowbr="\r\n";
        }

       $indxf=0; $indxl=0; $encindxf=0; $encindxl=0; $enc=0; $enc1=0; $enc2=0; $brk1=0; $rowindxf=0; $rowindxl=0; $encflg=0;
       $rowcnt=0; $colcnt=0; $rowflg=0; $colflg=0; $cell="";
       $headerflg=0; $quotedflg=0;
       $i=0; $i2=0; $imax=strlen($csv);   

       while($indxf < $imax)
         {
           //find first *possible* cell delimiters
           $indxl=strpos($csv, $delim, $indxf);  if($indxl===false) { $indxl=$imax; }
           $encindxf=strpos($csv, $enclos, $indxf); if($encindxf===false) { $encindxf=$imax; }//first open quote
           $rowindxl=strpos($csv, $rowbr, $indxf); if($rowindxl===false) { $rowindxl=$imax; }

           if(($encindxf>$indxl)||($encindxf>$rowindxl))
            { $quoteflg=0; $encindxf=$imax; $encindxl=$imax;
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }
           else 
            { //find cell enclosure area (and real cell delimiter)
              $quoteflg=1;
              $enc=$encindxf; 
              while($enc<$indxl) //$enc = next open quote
               {// loop till unquoted delim. is found
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//close quote
                 $encindxl=$enc; //last close quote
                 $indxl=strpos($csv, $delim, $enc+1); if($indxl===false)  { $indxl=$imax; }//last delim.
                 $enc=strpos($csv, $enclos, $enc+1); if($enc===false) { $enc=$imax; }//open quote
                 if(($indxl==$imax)||($enc==$imax)) break;
               }
              $rowindxl=strpos($csv, $rowbr, $enc+1); if($rowindxl===false) { $rowindxl=$imax; }
              if($rowindxl<$indxl) { $indxl=$rowindxl; $rowflg=1; }
            }

           if($quoteflg==0)
            { //no enclosured content - take as is
              $colflg=1;
              //get cell 
             // $cell=substr($csv, $indxf, ($indxl-$indxf)-1);
              $cell=substr($csv, $indxf, ($indxl-$indxf));
            }
           else// if($rowindxl > $encindxf)
            { // cell enclosed
              $colflg=1;
     
             //get cell - decode cell content
              $cell=substr($csv, $encindxf+1, ($encindxl-$encindxf)-1);

              if($mode=='EXCEL') //remove EXCEL 2quote escapes
                { $cell=str_replace($esc_enclos, $enclos, $cell);
                }
              else //remove STANDARD esc. sceme
                { $cell=str_replace($escapes, $escapes2, $cell);
                }
            }

           if($colflg)
            {// read cell into array
              if( ($fmt=='2D_FIELDNAME_ARRAY') && ($headerflg==1) )
               { $row[$fields[$colcnt]]=$cell; }
              else if(($fmt=='2D_NUMBERED_ARRAY')||($headerflg==0))
               { $row[$colcnt]=$cell; } //$rows[$rowcnt][$colcnt] = $cell;

              $colcnt++; $colflg=0; $cell="";
              $indxf=$indxl+1;//strlen($delim);
            }
           if($rowflg)
            {// read row into big array
              if(($headerrow) && ($headerflg==0))
                {  $fields=$row;
                   $row=array();
                   $headerflg=1;
                }
              else
                { $rows[$rowcnt]=$row;
                  $row=array();
                  $rowcnt++; 
                }
               $colcnt=0; $rowflg=0; $cell="";
               $rowindxf=$rowindxl+2;//strlen($rowbr);
               $indxf=$rowindxf;
            }

           $i++;
           //SWMessage("SW_ExplodeCSV() - colcnt = ".$colcnt."   rowcnt = ".$rowcnt."   indxf = ".$indxf."   indxl = ".$indxl."   rowindxf = ".$rowindxf);
           //if($i>20) break;
         }

       return $rows;
     }

...bob can now go back to his speadsheets

Ranjul Arumadi
  • 109
  • 1
  • 2
  • 11
HELPFUL_SHADOW
  • 119
  • 3
  • 1