iterate through all rows in specific column openpyxl

Question

I cannot figure out how to iterate through all rows in a specified column with openpyxl.

I want to print all of the cell values for all rows in column "C"

Right now I have:

from openpyxl import workbook
path = 'C:/workbook.xlsx'
wb = load_workbook(filename = path)
ws=wb.get_sheet_by_name('Sheet3')

for row in ws.iter_rows():
    for cell in row:
        if column == 'C':
            print cell.value

What's `ws`? How did you use `openpyxl`? Please give some more details about the goal you're trying to achive or else every answer will be based over assumptions. — danielhadar, Jul 27 '16 at 17:22

score 76 · Answer 1 · answered Aug 21 '17 at 11:44

76

Why can't you just iterate over column 'C' (version 2.4.7):

for cell in ws['C']:
   print cell.value

answered Aug 21 '17 at 11:44

Jonathan Koren

883
7
9

score 53 · Accepted Answer · edited Feb 01 '23 at 21:41

53

You can specify a range to iterate over with ws.iter_rows():

import openpyxl

wb = openpyxl.load_workbook('C:/workbook.xlsx')
ws = wb['Sheet3']
for row in ws.iter_rows('C{}:C{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        print cell.value

Edit: per your comment you want the cell values in a list:

import openpyxl

wb = openpyxl.load_workbook('c:/_twd/2016-06-23_xlrd_xlwt/input.xlsx')
ws = wb.get_sheet_by_name('Sheet1')
mylist = []
for row in ws.iter_rows('A{}:A{}'.format(ws.min_row,ws.max_row)):
    for cell in row:
        mylist.append(cell.value)
print mylist

edited Feb 01 '23 at 21:41

scotscotmcc

2,719
1
6
29

answered Jul 27 '16 at 17:26

mechanical_meat

163,903
24
228
223

1

Is string argument for `ws.iter_rows()` still valid? I needed to use `ws.iter_rows(min_row=ws.min_row, max_row=ws.max_row)` – xtian Nov 07 '20 at 18:11
With python3, I need to use `for row in ws.iter_rows(ws.min_row,ws.max_row)` or `for row in ws` to make it work. – Larry Guo Jul 23 '21 at 09:51

score 12 · Answer 3 · answered Jan 29 '19 at 21:08

12

You can also do this.

for row in ws.iter_rows():
   print(row[2].value)

With this you are still iterating through the rows (but not the cells) and only pulling the values from column C in the row to print.

answered Jan 29 '19 at 21:08

Sirsmorgasboard

121
1
2

Ali Sajjad · Answer 4 · 2020-07-13T18:08:59.153

Some of the solutions above don't quite work very well (maybe because of latest version of 'openpyxl'). After trying out different things, I used this:

Printing all rows with all columns:

import openpyxl

sheet = openpyxl.load_workbook('myworkbook.xlsx')['Sheet1']
# Iterating through All rows with all columns...
for i in range(1, sheet.max_row+1):
    row = [cell.value for cell in sheet[i]] # sheet[n] gives nth row (list of cells)
    print(row) # list of cell values of this row

Printing all rows with specific columns (e.g. 'E' to 'L'):

# For example we need column 'E' to column 'L'
start_col = 4 # 'E' column index
end_col = 11 # 'L' column index
for i in range(1, sheet.max_row+1):
    row = [cell.value for cell in sheet[i][start_col:end_col+1]]
    print(row) # list of cell values of this row

Please keep these points in mind:

sheet[N] gives the list of 'Cell' objects of Nth row. (N is a number starting from 1)
To get the first column cell of a row, use sheet[N][0]. (Because sheet[N] is a 'tuple' which can be indexed starting from zero 0).

Javier Espinoza · Answer 5 · 2021-06-20T03:53:29.920

2

it can go as:

import openpyxl
path = 'C:/workbook.xlsx'
# since is a print, read_only is useful for making it faster.
wb = openpyxl.load_workbook(filename = path, read_only=True)
# by sheet name 
ws=wb['Sheet3']

# non-Excel notation is col 'A' = 1, col 'B' = 2, col 'C' = 3.
# from row = 1 (openpyxl sheets starts at 1, not 0) to no max
for row in ws.iter_cols(min_row=1, min_col=3, max_col=3): 
    # for each row there is one cell object (since min_col = max_col)
    for cell in row:
        # so we print the value
        print(f'C{row}: ', cell.value)

edited Jun 20 '21 at 03:53

answered Jun 20 '21 at 03:46

Javier Espinoza

44
5

I find this answer the most useful for me. Good explanation. You can also use the combination of `ws.min_row` and `ws.max_row` to scan all used rows (i.e. not the whole worksheet) keeping column 3 fixed (i.e. C) in the iterator, for example: `ws.iter_rows(min_row=ws.min_row, max_row=ws.max_row, min_col=3, max_col=3)` – Domenico Spidy Tamburro Oct 29 '22 at 10:06

score 1 · Answer 6 · answered Jan 02 '21 at 02:33

1

I do it like this. I'm not sure what I'm doing but it does avoid the cells with no values.

from openpyxl import load_workbook
wb = load_workbook(filename = 'exelfile.xlsx')
ws = wb['sheet1']

for col in ws['A']:
    print (col.value)

answered Jan 02 '21 at 02:33

Ishty

11
1

score 1 · Answer 7 · answered Feb 03 '21 at 00:27

You can use coordinate property in cell object.

coordinate property contains cell address in string format.

For example,

from openpyxl import workbook
path = 'C:/workbook.xlsx'
wb = load_workbook(filename = path)
ws=wb.get_sheet_by_name('Sheet3')

for row in ws.iter_rows():
    for cell in row:
        if 'C' in cell.coordinate:
            print cell.value

score -1 · Answer 8 · edited Oct 22 '19 at 18:00

-1

listaClientes =[]
    for row in datos.iter_rows(min_row=2, min_col=3, max_col=3):
        for cell in row:
            listaClientes.append(cell.value)

edited Oct 22 '19 at 18:00

Jason Aller

3,541
28
38
38

answered Oct 22 '19 at 16:53

Marti Falcon Padro

11

iterate through all rows in specific column openpyxl

8 Answers8

Printing all rows with all columns:

Printing all rows with specific columns (e.g. 'E' to 'L'):

Linked