6

I would like to format a spreadsheet (xls or xlsx) so that any cells containing a word or ending with a certain string are formatted by filling the background with a specific color.

For example, if the cell contains the word 'deleted', fill it black and paint the text white. if the cell ends with '.pf', paint the cell red.

I found a similar question from several years ago that suggested the following:

import xlrd 
import xlutils.copy 

inBook = xlrd.open_workbook('input.xls', formatting_info=True) 
outBook = xlutils.copy.copy(inBook) 

def _getOutCell(outSheet, colIndex, rowIndex): 
    """ HACK: Extract the internal xlwt cell representation. """ 
    row = outSheet._Worksheet__rows.get(rowIndex) 
    if not row: return None 
    cell = row._Row__cells.get(colIndex) 
    return cell 

def setOutCell(outSheet, col, row, value): 
    """ Change cell value without changing formatting. """ 
    # HACK to retain cell style. 
    previousCell = _getOutCell(outSheet, col, row) 
    # END HACK, PART I 
    outSheet.write(row, col, value) 
    # HACK, PART II 

    if previousCell: 
        newCell = _getOutCell(outSheet, col, row) 
    if newCell:
        newCell.xf_idx = previousCell.xf_idx 
    # END HACK 


outSheet = outBook.get_sheet(0) 
setOutCell(outSheet, 5, 5, 'Test') 
outBook.save('output.xls') 

While this does copy the values from input.xls to output.xls, this does not seem to transfer the formatting (the test values from input.xls are no longer formatted when opening output.xls, nor are the conditional formatting rules present under "manage rules" in excel.

"if" statements for number values seem to work, but again, I am looking for a way to format cells containing certain strings. thanks!

user2363458
  • 73
  • 1
  • 5
  • Did you try changing the file in-place? You could create a backup of the file before your script runs. – Alfe May 08 '13 at 19:43
  • i dont think editing in place is possible. from what i understand xlutils only allows you to copy and modify the copy. if win32com can do this, that would be a cool direction to try, but i am not aware of that. win32com is also not my preferred tactic because i would like this script to function on machines without excel installed (linux). – user2363458 May 09 '13 at 15:00

1 Answers1

4

Preserve the original input.xls formatting when you open it:

from xlrd import open_workbook

input_wb = open_workbook('input.xls', formatting_info=True)

Create a new workbook based on this template:

from xlutils.copy import copy as copy_workbook

output_wb = copy_workbook(input_wb)

Define some new cell styles:

from xlwt import easyxf

red_background = easyxf("pattern: pattern solid, fore_color red;")
black_with_white_font = easyxf('pattern: pattern solid, fore_color black; font: color-index white, bold on;")

Evaluate and modify your cells:

input_ws = input_wb.sheet_by_name('StackOverflow')
output_ws = output_wb.get_sheet(0)

for rindex in range(0, input_ws.nrows):
   for cindex in range(0, input_ws.ncols):
       input_cell = input_ws.cell(rindex, cindex)
       if input_cell.value[ input_cell.value.rfind('.'): ] == 'pf':
           output_ws.write(rindex, cindex, input_cell.value, red_background)
       elif input_cell.value.find('deleted') >= 0:
           output_ws.write(rindex, cindex, input_cell.value, black_with_white_font)
       else:
           pass  # we don't need to modify it

Save your new workbook

output_wb.save('output.xls')

Using the above example, unmodified cells should have their original formatting intact.

Should you need to alter the cell content AND would like to preserve the original formatting (i.e. NOT use your custom easyxf instance), you may use this snippet:

def changeCell(worksheet, row, col, text):
    """ Changes a worksheet cell text while preserving formatting """
    # Adapted from https://stackoverflow.com/a/7686555/1545769
    previousCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
    worksheet.write(row, col, text)
    newCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
    newCell.xf_idx = previousCell.xf_idx

# ...

changeCell(worksheet_instance, 155, 2, "New Value")

For the comparisons, you can use the string methods find and rfind (which searches from the right). They return the index of the position of the substring within the string. They return -1 if the substring is not found. Ergo, you see above input_cell.value.find('deleted') >= 0 to evaluate whether or not the substring 'deleted' exists. For the .pf comparison, I used rfind as well as something in Python called slicing.

Community
  • 1
  • 1
pztrick
  • 3,741
  • 30
  • 35
  • this results in the following error: `Traceback (most recent call last): File ".\stacksolution.py", line 14, in input_ws = input.sheet_by_name('Sheet1') #eval and modify cells AttributeError: 'builtin_function_or_method object has no attribute 'sheet_by_name'` ...additionally, i am not clear on how you would apply string comparisons into logic. for example, your `if input_cell.value == "logic goes here"` portion.. that is fine if you are saying "<50 or >60, but how do you write it for "ends with .pf" or "contains the word delete somewhere in the string." – user2363458 May 09 '13 at 12:26
  • 1
    For the error, I had typo'd `input` as the Workbook instance name when it was actually `input_wb` (same for `output_wb`) -- be sure to update those two lines. Will update to show conditional expressions in a few. – pztrick May 09 '13 at 13:26
  • this is the best result so far. with the typo correction to input_wb and output_wb, the script is now successfully copying input.xls to output.xls with all the cell contents. it is also preserving some formatting, like font type, font size, font color. what it is **not** doing, is preserving the conditional formatting or my conditional formatting rules (perhaps this is a limitation of xlutils.copy). The logic you provided worked for files containing the word 'delete' somewhere in them, but not '.pf'. if I use the 'delete' statement with '.pf' it should work with a few false positives. thanks! – user2363458 May 09 '13 at 14:56
  • 1
    @user2363458: Yes, there are limitations to the xlrd/xlwt/xlutils library. For instance, I deal with drop-down menus and data validation at work -- these are not persisted after manipulating an XLS document in Python. I have had good luck using an ASP.NET library [GemBox.Spreadsheet](http://www.gemboxsoftware.com/spreadsheet/overview). I'm not a fan of ASP.NET but I guess it makes sense for the best Excel utility to be written on a Microsoft platform. This particular library CAN be deployed to Linux systems using Apache2 Mono. – pztrick May 09 '13 at 15:02
  • @user2363458: You can also search SO for questions in re Python + regex to learn how best to match your `input_cell.value` conditions. – pztrick May 09 '13 at 15:06