9

I have excel(file.xls)/csv(file.csv) file that contains/will contain hundreds of thousands of entry, even millions I guess. Is it possible to split this one to multiple file? Like file.xls to file1.xls, file2.xls, file3.xls and so on.

Are there any libraries to use? Is this possible on PHP? or how about javascript? On where I can specify how many rows to be included on each file?

Thanks

Kenneth P.
  • 1,797
  • 3
  • 21
  • 31
  • 1
    Are you talking about an xls (BIFF) file? or about a csv file? or a csv file with an extension of .xls? Excel xls files and CSV files two are very different... and an xls (BIFF) file is limited to 65535 rows, so unless your data is split across several worksheets it's impossible to have millions of entries – Mark Baker May 24 '13 at 10:18
  • both of them. Possible a cross compatibility code that will work both on csv or excel. is this even possible? – Kenneth P. May 24 '13 at 10:20
  • It's possible, but you're potentially going to have memory issues and slow performance when talking about such large volumes of data in an xls. My PHPExcel library can do it, but I'd recommend sticking with straight CSV which can easily be processed a row at a time – Mark Baker May 24 '13 at 10:22
  • PHPExcel, so you did wrote that library. Excellent! For now max entries of CSV is 100,000 for a test. Maybe I lurk around with it if you introduce how to do that? thanks – Kenneth P. May 24 '13 at 10:25

3 Answers3

30

Quick and dirty way of splitting a CSV file into several CSV files

$inputFile = 'input.csv';
$outputFile = 'output';

$splitSize = 10000;

$in = fopen($inputFile, 'r');

$rowCount = 0;
$fileCount = 1;
while (!feof($in)) {
    if (($rowCount % $splitSize) == 0) {
        if ($rowCount > 0) {
            fclose($out);
        }
        $out = fopen($outputFile . $fileCount++ . '.csv', 'w');
    }
    $data = fgetcsv($in);
    if ($data)
        fputcsv($out, $data);
    $rowCount++;
}

fclose($out);
Jon Winstanley
  • 23,010
  • 22
  • 73
  • 116
Mark Baker
  • 209,507
  • 32
  • 346
  • 385
  • 1
    Mark - this works fine, but given the OP's question on splitting large files, the example as written would create 500k files each with 2 rows of csv data. I'm sure this is not agreeable for him. I would recommend upping splitSize above to something more reasonable - say 10k at least. – Ross Nov 14 '14 at 14:27
  • Someting is wrong. Just first line is correct, other lines are in first column with semicolons. How it solve it? Could you improve it by add first line of large CSV (header) to each splited file? – tomasr Jan 17 '22 at 13:17
4

Yes it is possible to do that in PHP and with CSV files. You basically iterate over the large file and chunk each X rows, forwarding those rows to another file.

You find the information how to open the large CSV file as an iterator in this answer here:

Then you need to chunk the iterator each X rows parts. That can be done as outline here:

Just instead of outputting into multiple <ul>...</ul> HTML lists, you copy over into a new files. That basically works like outlined in:

However this time you want to use the SplFileObject::fputcsv method. Take care you use the latest stable PHP for this, otherwise you need do different, see fputcsv().

If the first line of the original file contains column-headers, you might be as well interested in the following:

It just shows some ways to extend / process the incomming file. You might not need the full abstraction done there, just keeping the first line around might do it already.

Community
  • 1
  • 1
hakre
  • 193,403
  • 52
  • 435
  • 836
0

I think You can also use "split by file size":

$part = 1;

$maxSize = 50;//50 Mb

$fopen = fopen('filename.csv','r') or die ('ERROR');

while (($line = fgetcsv($fopen, 10000, ";")) !== FALSE) {

    $ftowrite = fopen("Part_$part.csv",'a');

    fputcsv($ftowrite,$line);

    clearstatcache();

    $size = filesize ( "review_p$part.csv" ) / 1000000;

    if ($size  > $maxSize) {

        fclose($ftowrite);

        $part++;

    }
}