I've successfully written a script that downloads a CSV file using cURL and then parses the CSV into an array as follows:

$rows = array_map(function ($a) use ($delimiter) {
    return str_getcsv($a, $delimiter);
}, explode("\n", $result));

Then I iterate over $rows with a foreach to save each row to the database.
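
For illustration, the save loop is roughly the following (a minimal sketch: the PDO connection details and the items table with its columns are placeholders, not the actual schema):

// Placeholder DSN, credentials, and table/column names
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO items (col_a, col_b) VALUES (?, ?)');

foreach ($rows as $row) {
    // Each parsed CSV row becomes one insert
    $stmt->execute([$row[0], $row[1]]);
}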

The script works fine, but with a larger CSV file (more than 10,000 lines) it becomes rather slow and produces a lot more errors.

I would like to cut the CSV file into pieces so that the whole file is not loaded into a single variable. I found the following solution, but it still processes the whole file at once.

Is there a method to cut the CSV into pieces and run the database function several times? Or is there a better way to process a large CSV file like this?

I'm relatively new to processing large files, so please be kind!

  • Maybe if you show a bit more of your code we might be able to be more helpful – RiggsFolly Aug 10 '17 at 12:29
  • Use the solution you linked to read the CSV line by line, and after every x lines write them all to a file, clear memory, and start a new file. – Soronbe Aug 10 '17 at 12:31
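
For reference, a minimal sketch of Soronbe's suggestion: stream the source file line by line and start a new chunk file every $chunkSize lines (the file names and chunk size here are assumptions):

<?php
$source = fopen('big.csv', 'r');
$chunkSize = 1000; // lines per chunk file
$lineCount = 0;
$chunkIndex = 0;
$chunk = null;

while (($line = fgets($source)) !== false) {
    // Start a new chunk file every $chunkSize lines
    if ($lineCount % $chunkSize === 0) {
        if ($chunk !== null) {
            fclose($chunk);
        }
        $chunk = fopen('chunk_' . $chunkIndex++ . '.csv', 'w');
    }
    fwrite($chunk, $line);
    $lineCount++;
}

if ($chunk !== null) {
    fclose($chunk);
}
fclose($source);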

1 Answer


Save the file somewhere and then process it in chunks like this:

<?php
$filePath = 'big.csv';

//How many rows to process in each batch
$limit = 100;

$fileHandle = fopen($filePath, "r");
if ($fileHandle === FALSE)
{
    die('Error opening '.$filePath);
}

//Set up a variable to hold our current position in the file
$offset = 0;
while(!feof($fileHandle))
{
    //Go to where we were when we ended the last batch
    fseek($fileHandle, $offset);

    $i = 0;
    while (($currRow = fgetcsv($fileHandle)) !== FALSE)
    {
        $i++;

        //Do something with the current row
        print implode(', ', $currRow)."\n";

        //If we've hit our batch limit, save our position and start the next batch
        if($i >= $limit)
        {
            //Update our current position in the file
            $offset = ftell($fileHandle);

            //Break out of the row processing loop
            break;
        }
    }
}

//Close the file
fclose($fileHandle);
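
Since the goal is saving rows to the database rather than printing them, the print line above is where the inserts would go. Here is a minimal sketch of one way to adapt the loop, committing one transaction per batch so no single transaction grows too large (the PDO connection and the items table are assumptions, not part of the original answer):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$stmt = $pdo->prepare('INSERT INTO items (col_a, col_b) VALUES (?, ?)');

$limit = 100;
$fileHandle = fopen('big.csv', 'r');

$i = 0;
$pdo->beginTransaction();
while (($currRow = fgetcsv($fileHandle)) !== false) {
    $stmt->execute([$currRow[0], $currRow[1]]);

    // Commit every $limit rows and start a fresh transaction
    if (++$i % $limit === 0) {
        $pdo->commit();
        $pdo->beginTransaction();
    }
}
$pdo->commit();
fclose($fileHandle);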
  • If $limit is bigger than the quantity of rows, we can pick a workable chunk size like this:

        $chunkValOpt = [5000, 1000, 500, 100, 10, 2, 1];
        $chunk = 1;
        foreach ($chunkValOpt as $value) {
            $result = $this->chooseChunkSize($value, $filePath);
            if ($result != 0) {
                $chunk = $value;
                break;
            }
        }

    Here chooseChunkSize($value, $filePath) is nearly the same function, which returns a $tmp counter, where the second while contains only: $i++; if ($i >= $chunk) $tmp++; – Vit Feb 21 '20 at 14:37