So My application has to find the value and the color of the text inside the cell. I can find the background color of the text. For example I have a cell value of "Hello my name is ... nice to meet you" (bold text means the color is red) so when I find the color and see red text I want to remove it I have an idea on how to remove it but no idea on how to find the TEXT color if the cell value.
1 Answers
Surprisingly, this is not as straightforward as it might be but if you follow the logic it makes sense for standard text colour formatting.
To identify the font colour value, you need to run through the cell properties:
cell will return a Cell object
cell.font will return a Font object
cell.font.color will return a Color object
Below is the REPL output from an existing workbook as an example:
>> from openpyxl import load_workbook
>> from openpyxl.styles import Font
>> wb = load_workbook('example.xlsx')
>> sheet0 = wb['FEB 2019']
>>> print(sheet0['C20'].font)
<openpyxl.styles.fonts.Font object>
Parameters:
name='Calibri', charset=None, family=None, b=False, i=False, strike=False, outline=None, shadow=None, condense=None, color=<openpyxl.styles.colors.Color object>
Parameters:
rgb='00ffffff', indexed=None, auto=None, theme=None, tint=0.0, type='rgb', extend=None, sz=11.0, u=None, vertAlign=None, scheme=None
>>> print(sheet0['A1'].font.color)
<openpyxl.styles.colors.Color object>
Parameters:
rgb='00ffffff', indexed=None, auto=None, theme=None, tint=0.0, type='rgb'
The important point to note in this example is the colour is specified in rgb (not sure if it can be outputted in CMYK or another option)
This leads to the object attribute
cell.font.color.rgb
as in
>>> print(sheet0['A1'].font.color.rgb)
00ffffff
If the cell content is rich text, openpyxl cannot process this. Turns out to apply colour to part of a cell requires rich text formatting. See https://bitbucket.org/openpyxl/openpyxl/issues/20/richtext-request for the known issue in the library.
It can be done in xlrd, but only with the old xls format, not the modern xlsx. See this stackoverflow answer: How do I find the formatting for a subset of text in an Excel document cell (it's python2 code but easy to convert to python3) (see below - I've also added the line print('colour:', segment['font'].colour_index)
to show what the particular colour is.
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import xlrd
# accessing Column 'C' in this example
COL_IDX = 2
book = xlrd.open_workbook('example.xls', formatting_info=True)
first_sheet = book.sheet_by_index(0)
for row_idx in range(first_sheet.nrows):
text_cell = first_sheet.cell_value(row_idx, COL_IDX)
text_cell_xf = book.xf_list[first_sheet.cell_xf_index(row_idx, COL_IDX)]
# skip rows where cell is empty
if not text_cell:
continue
print(text_cell)
text_cell_runlist = first_sheet.rich_text_runlist_map.get((row_idx, COL_IDX))
if text_cell_runlist:
print('(cell multi style) SEGMENTS:')
segments = []
for segment_idx in range(len(text_cell_runlist)):
start = text_cell_runlist[segment_idx][0]
# the last segment starts at given 'start' and ends at the end of the string
end = None
if segment_idx != len(text_cell_runlist) - 1:
end = text_cell_runlist[segment_idx + 1][0]
segment_text = text_cell[start:end]
segments.append({
'text': segment_text,
'font': book.font_list[text_cell_runlist[segment_idx][1]]
})
# segments did not start at beginning, assume cell starts with text styled as the cell
if text_cell_runlist[0][0] != 0:
segments.insert(0, {
'text': text_cell[:text_cell_runlist[0][0]],
'font': book.font_list[text_cell_xf.font_index]
})
for segment in segments:
print(segment['text'])
print('italic:', segment['font'].italic)
print('bold:', segment['font'].bold)
print('colour:', segment['font'].colour_index)
else:
print('(cell single style)')
print('italic:', book.font_list[text_cell_xf.font_index].italic)
print('bold:', book.font_list[text_cell_xf.font_index].bold)
print('colour:', book.font_list[text_cell_xf.font_index].colour_index)
Converting your example.xlsx to example.xls, for cell C24 this gives the following output:
Person O, £216 GARDEN
(cell multi style) SEGMENTS:
Person O, £216
italic: 0
bold: 1
colour: 8
GARDEN
italic: 0
bold: 1
colour: 10
The italic/bold 0/1 is probably a boolean (1 is True).
colour depends on the colour map stored in the workbooks (as users can define their own colour maps: https://xlrd.readthedocs.io/en/latest/formatting.html#palette). 0-7 is system-fixed, 8+ is user-defined. 8 defaults to user-defined black, 10 defaults to user-defined red.

- 2,914
- 2
- 14
- 26
-
when I do cell.font i get the information about the font of the cell like size, font family, outline, shadow but the color is None so when I try cell.font.color i naturally get None hence .rgb doesnt work but the problem is when I look at the spreadsheet and the cell i am looking at in the code the cell value is "Some text(that bit is in black), 123123123 sdasd@asdad.com (this bit is in red)" so the problem is it cant pick the color I tried this in a all black cell to it still did not give me the desired out put of black or the hex value of it. I can provide code if needed – KillerKingTR Oct 21 '19 at 00:36
-
I can share the google drive link with you. I have reduced the information on the file greatly and I have altered with the names but the idea is still there. [link](https://drive.google.com/file/d/1ix0wzWHsCp0VVuX5vR73Dg8yolvaaPmw/view?usp=sharing) For example look at the cell with Michael Jackson – KillerKingTR Oct 22 '19 at 17:15
-
Thanks a lot for your help I will look at it when I am free but I am sure it works if not I will handle the problem In a different way. Thanks – KillerKingTR Oct 22 '19 at 18:40