8

We're developing and maintaining a couple of systems, which need to export reports in Excel format to the end user. The reports are gathered from a MySQL database with some trivial processing and usually result in ~40000 rows of data with 10-15 columns, we're expecting the amount of data to grow steadily.

At the moment we're using PHPExcel for the Excel generation, but it's not working for us anymore. After we go above 5000 rows, the memory consumption and loading times become untolerable, and can't be solved by indefinitely increasing PHP's maximum limits for memory usage and script execution times. Processing of the data is as lean as possible, and the entire problem is with PHPExcel being a memory hog. CSV generation would be lighter, but unfortunately we're required to export Excel (and Excel alone) from our services due to user demands. This is due to formatting requirements etc., so CSV isn't an option.

Any ideas/recommendations for a third party application/module/service/what ever for generating large excels? Doesn't matter if it's a commercial licence, as long as it fits our needs, can be integrated to existing PHP applications and does its job. Our services are generally running on linux/php/mysql and we can do just about whatever we need to do with the servers.

Thanks!

Community
  • 1
  • 1
lostcontrol
  • 81
  • 1
  • 1
  • 4
  • I'm too familiar with any other methods to actually generate xls files. As you said, making a csv and then making an xls of that would be quicker :). Did you look into a possible fix for phpExcel (like the discussion going on here: http://stackoverflow.com/questions/4817651/phpexcel-runs-out-of-256-512-and-also-1024mb-of-ram – Nanne May 18 '12 at 08:22
  • Is there a specific reason why you do not want to use CSV (multiple sheets, formatting, etc)? Also do you have to export to xls or xlsx (or it does not matter)? – Bo. May 18 '12 at 08:40
  • Nanne: As far as I know, one of our programmers has more or less tried everything to make PHPExcel run as light as possible. I'll forward your link to him, but I think we've already tried it. Can you recommend any reasonable alternatives to PHPExcel which could do the job better? Bo: User demands (ie. them not knowing how to import CSV properly or not wanting to bother with it), formatting and data integrity, multiple sheets and the lot. XLS exports are the minimum, as we have no control over the version of Excel the end users have, XLSX is not a necessity, but a nice bonus. – lostcontrol May 18 '12 at 08:55
  • The only thing i've come up with in the past, but rejected myself for several reasons (3rd party stuff, bit of a detour really, no experience with the API etc) is pushing it to a google spreadsheat using the API. Not sure if it is feasible at all to easily download it as an XLS, but you can look into it: https://developers.google.com/google-apps/spreadsheets – Nanne May 18 '12 at 09:29

6 Answers6

3

For such a large amount of data I would not recommend tools like PHPExcel or ApachePOI (for Java) because of their memory requirements. I have struggled with similar task recently and I have found convenient (but maybe little bit fiddly) way to inject data into spreadsheets. Serverside generation or updating of Excel spreadsheets can be achieved thus simple XML editing. I have XLSX spreadsheet sitting on the server and every time data is gathered from dB, I unzip it using php. Then I access specific XML files that are holding contents of worksheets that need to be injected and insert data manually. Afterwards, I compress spreadsheet folder in order to distribute it as an regular XLSX file. Whole process is quite fast and reliable. Obviously, there are few issues and glitches related to inner organisation of XLSX/Open XML file (e. g. Excel tend to store all strings in separate table and use references to this table in worksheets). But when injecting only data like numbers and strings, it is not that hard. If anyone is interested, I can provide some code.

Okay, here goes sample code for this. I have tried to comment what it does, but feel free to ask for further explanation.

<?php
/** 
 * Class for serverside spreadsheet data injecting
 * Reqs: unzip.php, zip.php (containing any utility functions able to unzip files & zip folders)
 *
 * Author: Poborak
 */
class DataInjector
{    
    //spreadsheet file, we inject data into this one
    const SPREADSHEET_FILE="datafile.xlsx";   
    // specific worksheet into which data are being injected    
    const SPREADSHEET_WORKSHEET_FILE="/xl/worksheets/sheet7.xml"; 
    //working directory, spreadsheet is extracted here
    const WSPACE_DIR="Wspace";
    // query for obtaining data from DB
    const STORE_QUERY = "SELECT * FROM stores ORDER BY store_number ASC"; 

    private $dbConn;
    private $storesData;

    /**
     * @param   mysqli  $dbConn
     */
    function __construct(mysqli $dbConn) {   
        $this->dbConn = $dbConn;
    }

    /**
     * Main method for whole injection process
     * First data are gathered from DB and spreadsheet is decompressed to workspace.
     * Then injection takes place and spreadsheet is ready to be rebuilt again by zipping.
     *
     * @return   boolean    Informace o úspěchu
     */     
    public function injectData() {

        if (!$this->getStoresInfoFromDB()) return false;        
        if (!$this->explodeSpreadsheet(self::SPREADSHEET_FILE,self::WSPACE_DIR)) return false;                      
        if (!$this->injectDataToSpreadsheet(self::WSPACE_SUBDIR.self::SPREADSHEET_WORKSHEET_FILE)) return false;            
        if (!$this->implodeSpreadsheet(self::SPREADSHEET_FILE,self::WSPACE_DIR)) return false;
        return true;
    }

    /**
     * Decompress spreadsheet file to folder
     *
     * @param   string  $spreadsheet
     * @param   string  $targetFolder
     *
     * @return   boolean    success/fail 
     */   
    private function explodeSpreadsheet($spreadsheet, $targetFolder) {
        return unzip($spreadsheet,$targetFolder);
    }

    /**
     * Compress source folder to spreadsheet file
     *
     * @param   string  $spreadsheet    
     * @param   string  $sourceFolder
     *
     * @return   boolean    success/fail 
     */   
    private function implodeSpreadsheet($spreadsheet, $sourceFolder) {
        return zip($sourceFolder,$spreadsheet);
    }

    /**
     * Loads data from DB to member variable $storesDetails (as array)
     *
     * @return   boolean    success/fail 
     */ 
    private function getStoresInfoFromDb() {
        unset($this->storesData);       

        if ($stmt = $this->dbConn->prepare(self::STORE_QUERY)) {
            $stmt->execute();
            $stmt->bind_result($store_number, $store_regional_manager, $store_manager, $store_city, $store_address);
            while ($stmt->fetch()) {
                $this->storesData[trim($store_number)] = array(trim($store_regional_manager),trim($store_manager),trim($store_address),trim($store_city));
            }           
            $stmt->close();
        }   
        return true;        
    }

    /**
     * Injects data from member variable $storesDetails to spreadsheet $ws
     *
     * @param   string  $ws target worksheet
     *
     * @return   boolean    success/fail
     */ 
    private function injectDataToSpreadsheet($ws) {
         $worksheet = file_get_contents($ws);    
         if ($worksheet === false or empty($this->storesData) return false;

         $xml = simplexml_load_string($worksheet);  
         if (!$xml) return false;

        // Loop through $storesDetails array containing rows of data
        foreach ($this->storesData as $std){

            // For each row of data create new row in excel worksheet
            $newRow = $xml->sheetData->addChild('row'); 

            // Loop through columns values in rowdata
            foreach ($std as $cbd){                      
                // Save each column value into next column in worksheets row 
                 foreach ($this->storesData as $cbd){
                    $newCell = $newRow->addChild('c'); 
                    $newCell->addAttribute('t', "inlineStr");
                    $newIs = $newCell->addChild('is');
                    // text has to be saved as utf-8 (otherwise the spreadsheet file become corrupted)
                    if (!mb_check_encoding($cbd, 'utf-8')) $cbd = iconv("cp1250","utf-8",$cbd); 
                    $newT = $newIs->addChild('t',$cbd);                     
                }
             }
         }

         // Save xml data back to worksheet file
         if (file_put_contents($ws, $xml->asXML()) !== false) return true;           
    }
}
?>   
bazinac
  • 668
  • 5
  • 22
  • Hi Poborak, your solution sounds quite promising and I'm interested in having a look at the code sample involved. Can you supply a download link? – lostcontrol Jun 07 '12 at 10:19
  • I ve inserted code into my original answer. I have simplified class that I use in my project to show basic idea and workflow. You only need to include function for unzipping and folder zipping for this sample. If needed I can provide too... – bazinac Jun 07 '12 at 12:39
  • @Poborak Which XLSX spreadsheet lib you are using, will it work for 2 lac rows? – Ankit Balyan Feb 05 '16 at 09:01
1

The list of alternatives for PHPExcel that I try to keep up to date is here

If you're after raw speed/memory performance above and beyond anything that PHPExcel can offer, then the only one I'd actually recommend is Ilia's wrapper extension for libXL, because the library is still actively supported.

Community
  • 1
  • 1
Mark Baker
  • 209,507
  • 32
  • 346
  • 385
  • Hi Mark, thanks for the list of alternatives. At the moment your recommendation of Ilia's wrapper for libXL seems to be the best option, but we'll have to look into it further. – lostcontrol May 21 '12 at 08:55
0

Did you try out the old Pear Excel (aka Spreadsheet_Excel_Writer: http://pear.php.net/package/Spreadsheet_Excel_Writer/redirected)?

Checkuout discussion regarding Pear Vs PHPExcel:
http://phpexcel.codeplex.com/discussions/240688

Nanne
  • 64,065
  • 16
  • 119
  • 163
Bo.
  • 2,547
  • 3
  • 24
  • 36
  • But he states that CSV is not an option due to (maybe unreasonable, but still in-place) requirements. How then is this an answer? – Nanne May 18 '12 at 08:28
  • @Nanne: it was not clear initially that CSV was not an option. It should be a hint that if two people make similar answer (that might not be correct) that the questions may be not as clear as initially planned. – Bo. May 18 '12 at 09:19
  • The origional question said literally: `CSV generation would be lighter, but unfortunately we're required to export Excel (and Excel alone)`. Seems clear to me. And even if it wasn't clear, there was something wrong with the question, ok, but that does not make this a valid answer to the (maybe unclear) question, now that it _is_ clear. – Nanne May 18 '12 at 09:25
  • @Nanne: I have edited the answer to remove the reference to CSV. :) And what was not clear is "...we're required to export Excel (and Excel alone)..." as excel is not file format but application loading file formats (Excel supports number of formats: xls, xlsx, etc). – Bo. May 18 '12 at 09:30
  • But the preceding part where the word `CSV` is mentioned in a discarding way still is quite clear. Anyway, this is leading to nothing. Not to pick on you or anything, but greetings and signing of posts isn't needed, your name is allready there :) – Nanne May 18 '12 at 09:35
0

You can export in CSV format, Excel can handle that. If you have problems writing the file, you can always loop the results (pagination) and append them to the CSV file

Try to convert afterwards using PHPExcel to .xsl or .odf format, otherwise leave it at CSV.

Nanne
  • 64,065
  • 16
  • 119
  • 163
Boby
  • 826
  • 5
  • 9
  • But he states that CSV is not an option due to (maybe unreasonable, but still in-place) requirements. How then is this an answer? – Nanne May 18 '12 at 08:28
  • I've suggested the CSV as a middle step because it's easy to generate. And then try to convert it into an Excel format. – Boby May 18 '12 at 08:35
  • But you did not include anything about the conversion (let alone the "leave it as CSV" remark). As `PHPExcel` has a problem with big sheets and memory you would run into the same problem, wouldn't you? And if not, you need to add how to convert big `.csv` files to `.xls` for this to be a helpfull answer as far as I'm concerned. – Nanne May 18 '12 at 08:38
  • Hi Boby, we used to deliver the reports in CSV format to the end user, but the problem was that some cell content was not imported correctly by Excel, unless the user used the separate Import-function (note; the CSV's were formed correctly). For example, strings such as "000123123" lost their leading zeros, which lead to data corruption. This was due to Excel deciding that the string was to be interpreted as a number, and the original content (ie. leading zeros) was lost. – lostcontrol May 18 '12 at 08:48
  • This was one of the main reasons we turned to PHPExcel for file generation, as we could then directly set the cell content type. Delivering CSV's wouldn't be an issue, if our end users would accept that they'd have to use the Import-feature - but unfortunately they won't, and we have to deliver Excels. – lostcontrol May 18 '12 at 08:48
  • We haven't tried the CSV -> XLS conversion in PHPExcel yet, but will look in to it further. My concern is that either the time/mem usage for the conversion will end up staying relatively same, or we'd lose control over the cell content types due to using the import/conversion features instead of setting cell content directly. – lostcontrol May 18 '12 at 08:48
  • Bo, that's why I stated that we need to generate Excel (and Excel alone) in my original question :) I'll edit my question to clarify this. – lostcontrol May 18 '12 at 08:59
  • Sorry @lostcontrol but most people who post questions are not aware of benefits of simplicity and try to over-complicate the problem. Based on your comments I understand the background of the problem more clearly and reasoning why CSV is not an option. – Bo. May 18 '12 at 09:13
0

Check out OfficeWriter. We recently specifically improved performance for massive datasets for a Fortune 500 financial company. It does way more with the file format than you specifically need (charts and what have you), but the API is pretty easy to use and with the evaluation you could get a POC up quickly. Disclaimer - I'm on the engineers who built the latest version.

One other downside for you guys is that it's .NET.

Nick Martin
  • 282
  • 1
  • 4
  • Hi Nick, OfficeWriter looks great, but as most of our services run on linux/php, migrating into .net is something we'd have to plan and consider for a longer time and I'm not sure if we can currently devote the resources for that task. For an existing .net environment this seems like a good option. – lostcontrol May 21 '12 at 08:46
-1

What about just printing table?

<?php
header("Content-Type:   application/vnd.ms-excel; charset=utf-8");
header("Content-Disposition: attachment; filename=abc.xls");  //File name extension was wrong
header("Expires: 0");
header("Cache-Control: must-revalidate, post-check=0, pre-check=0");
header("Cache-Control: private",false);

echo "<table><tr><td>Test</td><td>Test2</td></table>";

Excel file reader will open this as a normal sheet. Keep in mind that end result is working, but the actual content is just an HTML table.

FDisk
  • 8,493
  • 2
  • 47
  • 52