How to find the last row in a column using openpyxl normal workbook?

Question

I'm using openpyxl to add data validation to all rows that have "Default" in them. To do that, I need to know how many rows there are.

I know there is a way to do that if I were using Iterable workbook mode, but I also add a new sheet to the workbook and in the iterable mode that is not possible.

Possible duplicate of [Is it possible to get an Excel document's row count without loading the entire document into memory?](http://stackoverflow.com/questions/13377793/is-it-possible-to-get-an-excel-documents-row-count-without-loading-the-entire-d) — Torxed, Nov 05 '15 at 10:27

Charlie Clark · Accepted Answer · 2017-04-06T10:58:05.993

85

ws.max_row will give you the number of rows in a worksheet.

Since openpyxl 2.4 you can also access individual rows and columns and use their length to answer the question.

len(ws['A'])

Though it's worth noting that for data validation for a single column Excel uses 1:1048576.
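To make the two calls above concrete, here is a minimal sketch on an in-memory workbook. The sample rows and the list validation rule are assumptions made up for illustration, not taken from the question. It shows that `ws.max_row` and `len(ws['A'])` are both based on the sheet's used range, and how a whole-column range could be attached to a validation.

```python
from openpyxl import Workbook
from openpyxl.worksheet.datavalidation import DataValidation

# Build a small in-memory sheet (sample data, not from the question).
wb = Workbook()
ws = wb.active
ws.append(["Default", "x"])
ws.append(["Default", "y"])
ws.append([None, "z"])  # column A has no value in row 3

rows_in_sheet = ws.max_row     # last row of the sheet's used range
cells_in_col_a = len(ws["A"])  # column A spans that same used range

# Hypothetical list validation applied to the whole of column A.
dv = DataValidation(type="list", formula1='"Default,Custom"', allow_blank=True)
ws.add_data_validation(dv)
dv.add("A1:A1048576")

print(rows_in_sheet, cells_in_col_a)
```

Both numbers come out the same here even though A3 is empty, which is the behaviour discussed in the comments on this answer.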

edited Apr 06 '17 at 10:58

answered Nov 05 '15 at 11:22

Charlie Clark

However, `ws.max_row` will not check whether the last rows are empty. If cell _content_ at the end of the worksheet is deleted using the _Del_ key or by removing duplicates, the remaining empty rows at the end of your data will still count as used rows. If you do not want to keep these empty rows, you will have to delete those **entire** rows by selecting the row numbers on the left of your spreadsheet and deleting them (right click on selected row number(s) -> Delete) – V. Brunelle Apr 05 '17 at 16:50
That isn't the question, though is it? – Charlie Clark Apr 06 '17 at 10:40
`max_row` and `max_column` didn't work for `sheet = wb.active`. I am using `openpyxl==2.4.8` – Hussain Apr 09 '18 at 10:18
@Hussain : What didn't work for you, any errors or just an unexpected value? Does it work for you if you explicitly set the sheet by e.g. `wb.worksheets[0]`? Did you end up finding a different solution? – antiplex Aug 29 '18 at 15:13
@V.Brunelle Is there any workaround that doesn't require deleting the entire row? My issue is that I have some data in another column of that row. If I delete the entire row, the data in that other column also gets deleted. Kindly suggest – Chandra Shekhar Nov 27 '19 at 08:21
@ChandraShekhar If you have data in another column of the same row, then what I explained does not apply to you, since it is not effectively the last row of the sheet. – V. Brunelle Nov 27 '19 at 12:08
These two methods give me the same result. Am I missing something? `ws.max_row` gives me the last filled row in the entire worksheet. And `len(ws['A'])` should give me the length of that specific column. If longest row is 10 but col 'A' has 3 cells filled, the latter method should give me 3, not 10. – Bn.F76 May 06 '20 at 16:47
Why is this the accepted answer? As others have commented, it is obviously not correct - `len(ws['A'])` returns the same value as `ws.max_row`, the length of the longest column in the entire spreadsheet - and not the number of rows in individual column `A`. – ssc Feb 05 '22 at 08:05
This is the correct answer because worksheets should always be considered to be 16384, 1048576 in size. It's much easier to work with an evenly shaped worksheet initially. You can work down from that, including defining what counts as an empty cell. – Charlie Clark Aug 11 '23 at 17:11
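Since the comments establish that `len(ws['A'])` and `ws.max_row` report the same number, here is a minimal sketch of one way to "work down" from the used range to the last filled row of a single column. It assumes that "filled" means the cell value is not `None`; the helper name `last_filled_row` and the sample data are hypothetical.

```python
from openpyxl import Workbook


def last_filled_row(ws, column_letter):
    """Return the number of the last row in the column whose value is not None, or 0."""
    # Start at the bottom of the used range and walk upward.
    for row in range(ws.max_row, 0, -1):
        if ws[f"{column_letter}{row}"].value is not None:
            return row
    return 0


wb = Workbook()
ws = wb.active
for values in (["a", 1], ["b", 2], ["c", None]):
    ws.append(values)
ws["A4"] = "temp"
ws["A4"] = None  # cleared cell: row 4 still counts toward ws.max_row

sheet_rows = ws.max_row
col_a_last = last_filled_row(ws, "A")
col_b_last = last_filled_row(ws, "B")
print(sheet_rows, col_a_last, col_b_last)
```

Scanning from the end stops at the first value found, so gaps higher up in the column do not affect the result. Note that looking a cell up by coordinate creates it in memory if it did not exist yet.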

score 6 · Answer 2 · answered Nov 11 '21 at 13:13

This works well for me. It gives the number of non-empty rows in each column, assuming there are no empty rows in between.

from openpyxl import load_workbook as lw
from openpyxl.utils import get_column_letter

# your_xlsx_file and sheet_name are placeholders for your own path and sheet
wb = lw(your_xlsx_file)
ws = wb[sheet_name]

for col in range(1, ws.max_column + 1):
    col_letter = get_column_letter(col)
    # truthiness check: cells holding 0, "" or False are skipped as well
    max_col_row = len([cell for cell in ws[col_letter] if cell.value])
    print("Column: {}, Row numbers: {}".format(col_letter, max_col_row))

answered Nov 11 '21 at 13:13

DannyG

You should probably use an explicit check for `None`. And this will be faster if you work from the end of the worksheet. Also, aliasing `load_workbook`? For a single call? Really? – Charlie Clark Aug 11 '23 at 17:13

score 1 · Answer 3 · answered Aug 30 '22 at 12:58

Here is another solution that might be helpful. Since openpyxl's max_row and max_column also take into account empty cells with styles applied, I think using pandas is better in that case:

import pandas as pd

def get_max_row_column(df, sheet_name):
    # df is the dict of DataFrames returned by read_excel(sheet_name=None)
    max_row = 1
    max_col = 1
    for sh_name, sh_content in df.items():
        if sh_name == sheet_name:
            max_row = len(sh_content) + 1  # + 1 for the header row
            max_col = len(sh_content.columns)
            break
    coordinates = {'max_row': max_row, 'max_col': max_col}
    return coordinates

df = pd.read_excel('xls_path', sheet_name=None)
max_row = get_max_row_column(df, 'Test_sheet')['max_row']
max_col = get_max_row_column(df, 'Test_sheet')['max_col']

By providing sheet_name=None I get a dictionary of all worksheets, where each key is a sheet name and each value is the sheet content (a pandas DataFrame).

Damy Pui · Answer 4 · 2023-06-22T06:35:48.383

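Here's a function that returns the last filled row of a selected column; it does not take the length of other columns into consideration.

Input parameters:

ws - worksheet

column - selected column

!!! The range can be modified based on the needs

def return_lr(ws, column):
    selected_column = ws[column]
    # Index x is 0-based, so selected_column[x] is row x + 1.
    # If that cell is empty, row x is the last filled row.
    for x in range(1, min(100, len(selected_column))):
        if selected_column[x].value is None:
            return x
    # No empty cell within the scanned range.
    return min(100, len(selected_column))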

score 0 · Answer 5 · edited Oct 19 '21 at 14:57

Find the length of a row and the length of a column.

Column:

column = sheet['A']
output tuple --> (A1, A2, A3, ..., An)

len(column)
output length --> 18

For row length:

for i in sheet.iter_rows(max_row=1):
    print(len(i))

This will give you the length of the header row, where you put the feature names. If you want the length of every row, use max_row=len(column) instead.

edited Oct 19 '21 at 14:57

marc_s

answered Nov 09 '19 at 17:48

Taimoor Arif

This doesn't work. If col A is filled to row 3 and col B is filled to row 2, both will return 3. – PythonAmateur742 Jun 29 '20 at 14:43
Having the same issue right now. Can't get the length of the last row, as previous rows are longer. – Flying Thunder Dec 08 '21 at 14:43
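For the row-length problem raised in these comments, a minimal sketch of the same idea applied across a row: scan leftward from `ws.max_column` until a cell with a value is found. It assumes an empty cell is one whose value is `None`; the helper name `last_filled_column` and the sample rows are hypothetical.

```python
from openpyxl import Workbook


def last_filled_column(ws, row):
    """Return the index of the last column in the row holding a value, or 0 if none."""
    # Start at the right edge of the used range and walk left.
    for col in range(ws.max_column, 0, -1):
        if ws.cell(row=row, column=col).value is not None:
            return col
    return 0


wb = Workbook()
ws = wb.active
ws.append(["id", "name", "price", "note"])  # longer header row
ws.append([1, "widget"])                    # shorter data row

header_length = last_filled_column(ws, 1)
second_row_length = last_filled_column(ws, 2)
print(header_length, second_row_length)
```

Each row is measured on its own, so a longer row elsewhere in the sheet does not inflate the result.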

jun · Answer 6 · 2023-04-23T20:50:33.967

NOTE: THIS METHOD ASSUMES THAT THE COLUMN YOU ARE USING DOES NOT HAVE BLANK CELLS IN BETWEEN VALUES

| A       | B      | C     |
|:--------|:-------|:------|
| 10R46   | 1005   | 8017  |
| 10R46   | 10335  | 5019  |
| 100R91  | 1005   | 8017  | 
| 10R91   | 243    | 8870  | 
| 10M95   | 4918   | 8305  |
| 10M95   | 9017   | 8305  |
|         | 9470   | 8221  |

Load it into a pandas DataFrame and count the number of non-null values.

import pandas as pd

df_split_file = pd.read_excel('testfile.xlsx', sheet_name='sheet1')
# count() ignores empty cells; + 1 accounts for the header row
last_row = df_split_file['A'].count() + 1

print(last_row)

Result of last_row for the table above: 7
