51

I am generating reports in .xlsx using PHPExcel. It was okay in the initial testing stages with small data sets (tens of rows, 3 sheets), but now when using it on a real production data with over 500 rows in each sheet, it becomes incredibly slow. 48 seconds to generate a file, and when running a report that combines more information, the whole thing fails with Fatal error: Maximum execution time of 30 seconds exceeded in PHPExcel/Worksheet.php on line 1041. Sometimes it's in another PHPExcel file, so I doubt the exact location is that relevant.

Ideally, I would want to speed it up somehow, if possible. If not, then at least increase the execution limit for this script.

The only suggestions I have seen so far was to style in ranges instead of individual cells. Unfortunately, I already do my styling in ranges and it is rather minimal too. Any other suggestions?

SaltyNuts
  • 5,068
  • 8
  • 48
  • 80
  • I use this library for a report that only has 40 rows of data, but is VERY stylized and is 2 pages WIDE, so I have a lot of columns. It is pretty slow in my application, too. I ended up having to extend the max execution time in php.ini – Chris Baker May 12 '11 at 20:10
  • Excel's got a very "terse" internal binary format, and PHPExcel has to jump through great many hoops to convert PHP data types into Excel's internal representations. There's not much you can do to speed things up, other than reducing the amount of data going into the spreadsheet. – Marc B May 12 '11 at 20:27
  • @Marc, except this is Excel 2010 format, which is XML. My guess, based on the memory limit error, is that PHPExcel is using DOM (or worse an old PHP based XML parser) to load the XML file into memory. Something like this should probably be using XMLReader/XMLWriter and forgetting the results after operations are complete (except maybe indexing areas in a file). – Kevin Peno May 12 '11 at 20:42
  • 1
    @Kevin: true, but even within the xml format there's still major amounts of cdata sections with old-style binary blobs carried over from the old formats. – Marc B May 12 '11 at 20:43
  • The OP isn't talking about loading an Excel 2010 workbook, only writing; but PHPExcel uses SimpleXML to parse its XML-based formats.... switching to the push-based XMLReader is on the TODO list, but not the highest priority. XMLWRiter is used when writing XML-based formats. – Mark Baker May 12 '11 at 21:09
  • 1
    Ended up adding set_time_limit(30) on every iteration of some of the loops to extend the maximum time of execution. – SaltyNuts May 17 '11 at 14:37

8 Answers8

63

Is it populating the worksheet? or saving? that you find too slow?

How are you populating the spreadsheet with the data?

  • Using the fromArray() method is more efficient than populating each individual cell, especially if you use the Advanced Value Binder to set cell datatypes automatically.
  • If you're setting values for every individual cell in a sheet using

    $objPHPExcel->getActiveSheet()->setCellValue('A1',$x);
    $objPHPExcel->getActiveSheet()->setCellValue('B1',$y);
    

    use

    $sheet = $objPHPExcel->getActiveSheet();
    $sheet->setCellValue('A1',$x);
    $sheet->setCellValue('B1',$y);
    

    so that you're only accessing the getActiveSheet() method once; or take advantage of the fluent interface to set multiple cells with only a single call to $objPHPExcel->getActiveSheet()

    $objPHPExcel->getActiveSheet()->setCellValue('A1',$x)
                                  ->setCellValue('B1',$y);
    

You've commented on applying styles to ranges of cells:

  • You also have the option to use applyFromArray() to set a whole variety of style settings in one go.
  • It's a lot more efficient if you can apply styles to a column or a row rather than simply to a range

If you're using formulae in your workbook, when saving:

  • Use

    $objWriter->setPreCalculateFormulas(false)
    

    to disable calculating the formulae within PHPExcel itself.

Those are just a few hints to help boost performance, and there's plenty more suggested in the forum threads. They won't all necessarily help, too much depends on your specific workbook to give any absolutes, but you should be able to improve that slow speed. Even the little notebook that I use for development can write a 3 worksheet, 20 column, 2,000 row Excel 2007 file faster than your production server.

EDIT

If it was possible to simply improve the speed of PHPExcel itself, I'd have done so long ago. As it is, I'm constantly performance testing to see how its speed can be improved. If you want faster speeds than PHPExcel itself can give, then there's a list of alternative libraries here.

YakovL
  • 7,557
  • 12
  • 62
  • 102
Mark Baker
  • 209,507
  • 32
  • 346
  • 385
  • 3
    Does anyone also noticed that `$sheet->insertNewRowBefore($row)` alone (inside a loop) also causes some performance hit? – leonard.javiniar Jan 21 '16 at 09:30
  • 5
    @javiniar.leonard - `insertNewRowBefore()` is always expensive, and more so when inserting at the top of the sheet than near the bottom: Always try to avoid using it in a loop; you can specify a number of rows to insert as a second argument, and that's a lot more efficient than inserting each row individually – Mark Baker Jan 21 '16 at 10:16
  • 1
    Also `xdebug` may be a performance bottleneck, I saw that when it is active, PHPExcel is much slower. – tonix Jan 18 '18 at 09:35
  • 1
    xdebug slows code execution regardless of the PHP that you're running, PHPExcel is no exception – Mark Baker Jan 18 '18 at 09:39
  • Super speed: Moving $sheet->insertNewRowBefore($row) outside from loop increased speed drastically, It's very easy to know length of rows and pre populating before loop increased my 250 rows 22 columns file from 6 seconds to milliseconds. – Abdul Rehman Dec 18 '18 at 09:34
15

I ran into this issue as well. Thought I'd throw my two cents in since this question gets so many views.

Setting Cell Values

Instead of setting the value for each cell individually, use the fromArray() method. Taken and modified from the wiki.

$arrayData = array(
array(NULL, 2010, 2011, 2012),
array('Q1',   12,   15,   21),
array('Q2',   56,   73,   86),
array('Q3',   52,   61,   69),
array('Q4',   30,   32,    0),
);

$as = $objPHPExcel->getActiveSheet();

$as->fromArray(
    $arrayData,  // The data to set
    NULL,        // Array values with this value will not be set
    'C3'         // Top left coordinate of the worksheet range where
                 //    we want to set these values (default is A1)
);

Styling Cells

Static

It is also quicker to apply the styles for a range, than to set the style for each cell individually (noticing a pattern??).

$default_style = array(
    'font' => array(
        'name' => 'Verdana',
        'color' => array('rgb' => '000000'),
        'size' => 11
    ),
    'alignment' => array(
        'horizontal' => \PHPExcel_Style_Alignment::HORIZONTAL_CENTER,
        'vertical' => \PHPExcel_Style_Alignment::VERTICAL_CENTER
    ),
    'borders' => array(
        'allborders' => array(
            'style' => \PHPExcel_Style_Border::BORDER_THIN,
            'color' => array('rgb' => 'AAAAAA')
        )
    )
);

// Apply default style to whole sheet
$as->getDefaultStyle()->applyFromArray($default_style);

$titles = array(
    'Name',
    'Number',
    'Address',
    'Telephone'
);

$title_style = array(
    'font' => array(
        'bold' => true
    ),
    'fill' => array(
        'type' => \PHPExcel_Style_Fill::FILL_SOLID,
        'startcolor' => array('rgb' => '5CACEE')
    ),
    'alignment' => array(
        'wrap' => true
    )
);

$as->fromArray($titles, null, 'A1'); // Add titles

$last_col = $as->getHighestColumn(); // Get last column, as a letter

// Apply title style to titles
$as->getStyle('A1:'.$last_col.'1')->applyFromArray($title_style);

Dynamically

I use PHPExcel to check the data given in the spreadsheet with the current data in the database. Since each cell is checked individually, I put the styles in an array (null for no style), and used the loop below to get the range of cells to apply the style to.

/*
 * $row is previously set in a loop iterating through each 
 *     row from the DB, which is equal to a spreadsheet row.
 * $styles = array(0 => 'error', 1 => 'error', 2 => null, 3 => 'changed', ...);
 */
$start = $end = $style = null;
foreach ($styles as $col => $s) {
    if (!$style && !$s) continue;
    if ($style === $s) {
        $end = $col;
    } else {
        if ($style) {
            $array = null;
            switch ($style) {
                case 'changed':
                    $array = $this->changed_style;
                    break;
                case 'error':
                    $array = $this->error_style;
                    break;
                case 'ignored':
                    $array = $this->ignored_style;
                    break;
            }
            if ($array) { 
                $start = \PHPExcel_Cell::stringFromColumnIndex($start);
                $end = \PHPExcel_Cell::stringFromColumnIndex($end);
                $as->getStyle($start.$row.':'.$end.$row)->applyFromArray($array);
            }
        }
        $start = $end = $col;
        $style = $s;
    }
} 
Katrina
  • 1,922
  • 2
  • 24
  • 42
  • 1
    fromArray is incredebly faster! – Andreas May 31 '16 at 21:18
  • For fill color - it should be `fillType` and `startColor` instead of `type` and `startcolor` in the `fill` array (at least in PhpSpreadsheet version 1.28.0) – obe Mar 31 '23 at 20:16
6

I was running into the same issue - had about 450 rows with 11 columns of data that I was trying to write, and I kept running up against the 30-second timeout. I was able to get the execution time down to 2 seconds or less by adding all of my new rows in bulk, and then going through and setting the cell content after the fact. In other words, I insert 450 rows in one call to insertNewRowBefore(), and then loop through and set content within those rows later.

Like so:

$num_rows = count($output_rows);
$last_row = $sheet->getHighestRow();
$row = $last_row + 1;
$sheet->insertNewRowBefore($row, $num_rows);
// Now add all of the rows to the spreadsheet
foreach($output_rows as $line) {
    $i = 0;
    foreach($line as $val) {
        // Do your setCellValue() or setCellValueByColumnAndRow() here
        $i++;
    }
    $row++;
}
JaredC
  • 1,360
  • 14
  • 11
  • I actually never explicitly add rows, except for a couple header rows that I add after all the data has been entered, just to simplify calculating the data cell coordinates. – SaltyNuts Apr 05 '12 at 15:24
  • 2
    I tried your suggestion with `->insertNewRowBefore()`. I am writing roughly 6000 rows in a two sheets. The cycle `->setCellValue` without using `->insertNewRowBefore` took **151 seconds** and with using `->insertNewRowBefore`it took **147 second**. So unfortunatelly I cannot confirm your solution :-( – sumid Nov 28 '13 at 20:14
1

For a XLSX export with columns a - amj (~800) and only ~50 rows I also ran into the 30 second boundary. To test my program, I limited the amount of rows processed to just 7, which worked in 25 sec.

  1. going from individual $objPHPExcel->getActiveSheet() to $sheet (first advice) it actually increased the time on a limited amount of rows from 25 sec to 26 sec.

  2. What really helped me was replacing all my getHighestDataColumn() with a simple $column_nr variable that is incremented in PHP, I went from 26 sec to 7 sec.

After that I was able to process all 50 rows in 11 sec.

Cas Tuyn
  • 11
  • 2
1

I am in no means an expert in using PHPExcel, but the OfficeOpenXML format (the format of *.xlsx files) is itself a group of XML files packed in ZIP archive with *.xlsx extension. If you value your performance and know what kind of data you will be passing, maybe it is a better idea to build own XLSX generator, stripped down to the most important functions, maybe making some calculations on database layer etc. instead of parsing the whole document.

To do it, you can begin with analyzing files generated using smaller data sets (by changing extension from *.xlsx into *.zip, unpacking it and browsing through the contents of the single files). That way you could determine what you really need and generate it yourself (by creating appropriate XML files and packing them into ZIP archive, then renaming to have *.xlsx extension).

There is also specification of OfficeOpenXML, which is large (a couple thousands of pages), thus I do not propose reading it unless you really want to. Creating files to match the way they were generated by PHPExcel should be enough.

The solution mentioned above does not include any PHPExcel-related tips, because I am not an expert in it. I have been previously interested in OOXML standarization process however, and would be happy if knowledge about this standard would help you solve your problem.

Tadeck
  • 132,510
  • 28
  • 152
  • 198
  • that's rather hard core. I actually don't do much in terms of calculations in the file, it's most data presentation with some very minor basic calculations. in the short run, i increased execution time limit, in the long run, might consider a custom engine or moving to csv files. – SaltyNuts May 12 '11 at 20:46
  • 2
    I like how Atlassian JIRA solves it - JIRA just saves an HTML document named as .xls, and Excel is able to open and display it just fine. And rendering HTML is so fast and straight forward. There is one caveat though - such HTML files will make Excel to warn about format mismatch when opening. – JustAMartin Mar 04 '15 at 14:09
1

One performance tip I hadn't seen before relates to adding worksheets – or more specifically, setting the title of a worksheet. If you add many worksheets, the order of operations can have a huge impact. For the following tests, I used a spreadsheet with 120 populated worksheets and timed how long it took to create another 120 empty worksheets.

First, using the steps shown in the docs:

for ($i = 0; $i < 120; $i++) {
    $sheet = $spreadsheet->createSheet();
    $sheet->setTitle('Sheet Title' . $i);
}
// Time: 12.5605s

Second, alternative method from the docs:

for ($i = 0; $i < 120; $i++) {
    $sheet = new Worksheet($spreadsheet, 'Sheet Title' . $i);
    $spreadsheet->addSheet($sheet);
}
// Time: 0.0266s

Most of the performance gap between the above two methods can be closed by using the second parameter of setTitle (if it is safe to do so in your case; see the docs):

for ($i = 0; $i < 120; $i++) {
    $sheet = $spreadsheet->createSheet();
    $sheet->setTitle('Sheet Title' . $i, false);
}
// Time: 0.5793s
aplum
  • 188
  • 1
  • 8
0

I had the exact same issue. Got a 5000 row, 32 column CSV file that took forever to process. It turns out almost all the time spent "processing" is actually the character encoding which is set to encode everything to UTF8 by default. So if you go into your config\excel.php file and scroll down to encoding, just set it as:

/*
|--------------------------------------------------------------------------
| Import encoding
|--------------------------------------------------------------------------
*/
    'encoding' => array(

        'input'  => '',
        'output' => ''

    ),

With this alone - the above mentioned file takes around 8sec to process. You might want to warn your client to save the CSV correctly though.

Skipp
  • 34
  • 1
  • 7
0

In my case I increased performance by changing cache storage method to in memory gzip cache_in_memory_gzip

$cm = \PHPExcel_CachedObjectStorageFactory::cache_in_memory_gzip;
\PHPExcel_Settings::setCacheStorageMethod($cm);
Farid Movsumov
  • 12,350
  • 8
  • 71
  • 97