11

When importing a Google Sheets-exported .xlsx file in PHPExcel, PHPexcel methods $worksheet->getHighestDataColumn() and $worksheet->getHighestDataRow() both return the maximum set in Google Docs, including completely empty cells, which is usually 26x1000.

The cells are just as empty as regular empty Excel cells as far as I can tell, when opening them in Google Sheets/Excel, i.e. no spaces.

Maybe Google Sheets uses a slightly different approach to "reserving" columns and rows than Excel (which generates them on the fly).

I would like to get the highest non-empty row, so that I don't allocate lots of memory unnecessarily when looping over rows. Respecting the format that I'm using, I can save some evaluation in empty rows, but in the end, I'm forced to simply loop over all rows and checking whether all cells are empty.

Ruben
  • 3,452
  • 31
  • 47
  • NULL and empty string (`""`) are both valid values for an MS Excel cell, which is why PHPExcel's `getHighestDataColumn()` method recognises them – Mark Baker Nov 10 '15 at 14:53
  • 1
    But the only way to identify cells comtaining those values is to load the worksheet, and check the cell content, so I can't see any practical way of saving memory, other than by modifying the Reader to test the value and discard cells that are NULL or "" – Mark Baker Nov 10 '15 at 14:54
  • @MarkBaker Thanks for the quick response. It's just that sheets imported from Google apparently work slightly differently than sheets from Excel with regard to how an empty (completely untouched, auto-generated cell) is 'reserved'. Maybe, if Google Sheets become more common, a `getHighestNonemptyColumn` method would be desirable (but I've got this covered myself for my specific case). – Ruben Nov 10 '15 at 15:42
  • From what I can see from the sources of the `PHPOffice/PhpSpreadsheet` on github [Cells class](https://github.com/PHPOffice/PhpSpreadsheet/blob/8380fb3ad28242093c18d0baa9b7e67fb429ea84/src/PhpSpreadsheet/Collection/Cells.php#L169) internally is looping over every cell anyway – Michal Bieda Mar 29 '18 at 06:11
  • 1
    Unfortunately, PHPExcel always loads the whole xlsx file in the memory (and it's pretty consuming) as of the xlsx format pecularities, so, I would suggest to move to .csv as it's integrated in PHP and you can easily implement lazy and optimized handling. – Dmitry Nevzorov May 20 '18 at 19:23

1 Answers1

0

If you're comfortable creating a custom function in Google Apps Script perhaps take a look at this answer. Use this before export so you know how many rows are coming in. It finds the last row of a single column. However, as it's based on speed you might find it traverses each column more quickly than other methods.

For your script try Math.max on a forEach loop traversing the results of myLastRow():

Determining the last row in a single column

I note you do not specify if your columns have empty cells before they end. The above link does require all cells to be non null (in Google Sheet terms).

DeeKay789
  • 353
  • 2
  • 4
  • 8