8

I'm trying to parse data in an excel spreadsheet using XLRD to determine which cell values are italicized. This information will be used to set a flag as to whether the value is an estimated or reported value. Below is an example of the data:

owner_name          year    Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sep     Oct     Nov     Dec
Alachua, city of    1978    17.4    15.7    16.7    18.3    18.9    18.9    19.2    17.4    19.5    19.8    17.1    16.4
Archer, city of     1978    5.6      3.6     4.3     4.5     4.7     4.8     5.3     5.3     5.4     5.6     3.9     2.8

I have not used XLRD to any great extent, aside from playing around with some of the basic functions to get a feel for how to pull data from the spreadsheet. Now I need to add that extra bit of functionality to identify italicized cell values.

Thanks in advance for your help...

EDIT: XLRD provided me with the functionality I need; thanks to John Machin for the answer. Here is teh codez:

import xlrd

book = xlrd.open_workbook('fl_data.xls',formatting_info=True)
sh = book.sheet_by_index(0)

for row in range(0,sh.nrows):
    font = book.font_list
    cell_val = sh.cell_value(row,1)
    cell_xf = book.xf_list[sh.cell_xf_index(row,1)]

    print cell_val,font[cell_xf.font_index].italic
Jason Bellino
  • 494
  • 1
  • 6
  • 18
  • I would also be open to using win32com if there is a better way down that route. I've already built the Microsoft Excel library with MakePy.py. – Jason Bellino Jul 18 '11 at 18:27
  • Your solution in the edit works with italics, but it doesn't seem to work with bold and underline (substituting `italic` on the last line of your code for `bold` and `underlined` respectively). Any idea why? – Brōtsyorfuzthrāx Jan 15 '16 at 07:39
  • 1
    Okay, it looks like instead of underlined, I should have used `underline_type`, and instead of `bold`, I should have checked `weight` (normal text is 400, bold is 700). At least, this is with an `.xls` file saved in LibreOffice on Xubuntu. `print(font[cell_xf.font_index].__dict__)` tells what is set to what. Anyway, this still doesn't explain why `bold` and `underlined` don't change, though. – Brōtsyorfuzthrāx Jan 15 '16 at 08:04

2 Answers2

1

My solution here was based on a class written by 'timmorgan' which can be found here. The class requires that the excel document you wish to act upon be open. You then create the excel document object and then call the 'get_range' method which returns a range object. This range object can then be used to get at font properties of the cell specified.

#--Requires excel document to be open
import pyexcel
book = pyexcel.ExcelDocument(visible=True) #--keeps excel open
cell = 'r171'
r = book.get_range(cell)
val = book.get_value(cell)

print val, r.font.italic, r.font.name
Jason Bellino
  • 494
  • 1
  • 6
  • 18
  • Actually, this stopped working for me when I tried to import the XLRD module and use that to get column names. Will update if I can get it working again. – Jason Bellino Jul 18 '11 at 21:06
  • The `book` excel document object continues to have a `get_range` method, but the resulting range object `r` no longer has font methods. `print r` results in the value of the cell though the error message I get says: "AttributeError: '' object has no attribute 'font'" – Jason Bellino Jul 18 '11 at 21:12
1

Using xlrd (by itself, not with pyexcel):

Here is a link to a topic to the python-excel google-group. It's about getting font colour but that gets you 99% of the way.

John Machin
  • 81,303
  • 11
  • 141
  • 189
  • It worked, thank you John. As with Nancy, I was having a bit of trouble fumbling through the manual. Your reply to her got me on the path I needed. For the record I'm posting the code as an edit to my question above. – Jason Bellino Jul 19 '11 at 13:38
  • 2
    Just as a note for those who didn't read the question enough to see the edit (and are furiously trying to figure this out), Jason's solution is in the question. – Brōtsyorfuzthrāx Jan 15 '16 at 07:25