Use OpenPyXL to iterate through sheets and cells, and update cells with contantenated string

Question

I would like to use OpenPyXL to search through a workbook, but I'm running into some issues that I'm hoping someone can help with.

Here are a few of the obstacles/to-dos:

I have an unknown number of sheets & cells
I want to search through the workbook and place the sheet names in an array
I want to cycle through each array item and search for cells containing a specific string
I have cells with UNC paths that reference an old server. I need to extract all the text after the server name within the UNC path, update the server name, and contatenate the remaining text back on the server name
e.g. \file-server\blah\blah\blah.xlsx; extract \file-server\; replace with \file-server1\; put remaining blah\blah\blah.xlsx after new name.
Save xlsx document

I'm new to Python, so would someone be able to point me in the right direction? Sample code is appreciated, because all I know how to do at this point is search through a known workbook, with known sheet names, and then print the data. I don't know how to include wildcards when iterating through worksheets & cells.

What I've done to show the contents of the cells:

from openpyxl import load_workbook, worksheet

def main():
    #read workbook to get data
    wb = load_workbook(filename = 'Book1_test.xlsx', use_iterators = True)
    ws = wb.get_sheet_by_name(name = 'Sheet1')
    #ws = wb.worksheets

    #Iterate through worksheet and print cell contents
    for row in ws.iter_rows():
        for cell in row:
            print cell.value

    #Iterate through workbook & print worksheets     
    #for sheet in wb.worksheets:
    #    print sheet


if __name__ == '__main__':
    main()

-----------------------Update-------------------------

I'm able to search through the cells and extract the server name from the cell, but I I'm not able to save the spreadsheet because I'm in read only mode. When I try to switch to optimized_write=True I get the error:

AttributeError: 'ReadOnlyCell' object has no attribute 'upper'

Here's my code:

from openpyxl import load_workbook, worksheet, Workbook

def main():
    #read workbook to get data
    wb = load_workbook(filename = 'Book1_test.xlsx', use_iterators = True)
    ws = wb.get_sheet_by_name(name = 'Sheet1')
    #ws = wb.worksheets

    #Iterate through worksheet and print cell contents
    for row in ws.iter_rows():
        for cell in row:
            cellContent = str(cell.value)
            #Scans the first 14 characters of the string for the server name
            if cellContent[:14] == '\\\\file-server\\':
                #open workbook in write mode?
                wb = Workbook(optimized_write=True)
                ws = wb.create_sheet()

                #update cell content
                ws[cell] = '\\\\file-server1\\' + cellContent[14:]
                print cellContent[:14]

                #save workbooks
                wb.save('Book1_test.xlsx')


if __name__ == '__main__':
    main()

Does anyone know how to update cell contents?

score 4 · Answer 1 · edited May 23 '17 at 12:10

4

Why don't you read the documentation? If you simply open the workbook with no flags you can edit it.

This is a duplicate of OpenPyXL + How can I search for content in a cell in Excel, and if the content matches the search criteria update the content?

edited May 23 '17 at 12:10

Community

1
1

answered Aug 11 '14 at 19:39

Charlie Clark

18,477
4
49
55

1

I'm new to editing excel files with Python, so I was looking through the documentation and posting here as well. I'm able to edit the document, but not existing cells. user3203010 is correct; only new cells can be created, so I'd have to copy the contents of the workbook to a new workbook and change the fields in process, and I don't want to do that. Also, if I open the workbook with no flags I won't be able to iterate through the cells in the sheet; unless you're referring to something else. – Tellander Aug 12 '14 at 19:06
On standard worksheets you can loop through the rows by simply using `ws.rows()`. If you don't mind working with unordered cells and are really only interested in the values then `for _, cell in ws._cells.items()` is what you want. The optimised versions only need to be used when you really only want either only to read or write Excel files. – Charlie Clark Aug 13 '14 at 07:36

score 2 · Accepted Answer · answered Aug 11 '14 at 17:28

I dont think you can update cell contents. You can open a file to read, or open a new file to write to. I think you have to create a new workbook, and every cell that you read, if you choose to not modify it, write it out to your new workbook. In your sample code, you are overwriting wb (used to read) with the wb (used to write). Pull it out of the for loop, assign a different name to it.

score 2 · Answer 3 · edited Apr 11 '16 at 19:20

2

You can update the content in a cell. You need to assign a value:

workBook = load_workbook('example.xlsx')
sheet = workBook.get_sheet_by_name('sheet')

a = sheet.cell(row=i,column=j)

a.value = 'nuevo valor'

and then save:

workBook.save('example.xlsx')

edited Apr 11 '16 at 19:20

Paul Roub

36,322
27
84
93

answered Apr 11 '16 at 19:16

user6189938

21
1

PepeContraPepe TradeMark · Answer 4 · 2017-05-13T11:35:11.193

-1

by rows something like this (as an idea) works:

sheet = wb.create_sheet(index = 1, title = 'Hipster') # name of the obj. sheet

for counter in range(1,11):
    sheet['A'+ str(counter)] = 'Hola'

edited May 13 '17 at 11:35

answered May 13 '17 at 11:23

PepeContraPepe TradeMark

1
4

Use OpenPyXL to iterate through sheets and cells, and update cells with contantenated string

4 Answers4

Linked