2

I want to change the color of some text according to the RGB value provided. I have a dataframe and also have a dictionary containing tokens to be annotated and the intensity to use. Following is an example of such an annotation:

annotations = {'quick': [180, 0, 0], 
               'brown': [200, 0, 0],
               'fox': [250, 0, 0],
               'lazy': [0, 0, 240],
               'dog': [0, 0, 255]}

I want something like the following (the colors here are set manually just for illustration): text marked with color gradient

I have already tried the following. But either they show how to mark the whole cell a single color or the output format is HTML and not a dataframe which can be saved as an excel or a csv file:

Following is the closest implementation I ended up with by modifying one of the aforementioned solutions Python Pandas: Highlight matching text and row:

import re
import pandas as pd
from IPython.display import HTML

def display_highlighted_words(df, keywords):
    head = """
    <table>
        <thead>
            """ + \
            "".join(["<th> %s </th>" % c for c in df.columns])\
            + """
        </thead>
    <tbody>"""

    for i,r in df.iterrows():
        row = "<tr>"
        for c in df.columns:
            matches = []
            for k in keywords:
                for match in re.finditer(k, str(r[c])):
                    matches.append(match)
        
            # reverse sorting
            matches = sorted(matches, key = lambda x: x.start(), reverse=True)
        
            # building HTML row
            cell = str(r[c])
            for match in matches:
                cell = cell[:match.start()] +\
                    "<span style='color:red;'> %s </span>" % cell[match.start():match.end()] +\
                    cell[match.end():]
            row += "<td> %s </td>" % cell
                
            row += "</tr>"
        head += row

    head += "</tbody></table>"
    return head
# display(HTML(head))

df = pd.DataFrame([["Franco color Franco",1], 
                   ["Franco Franco Ciccio Franco",2], 
                   ["Ciccio span",3]], columns=["A", "B"])

print(type(display_highlighted_words(df, ["Franco", "Ciccio"])))
df = pd.read_html(display_highlighted_words(df, ["Franco", "Ciccio"]))
print(type(df[0]))
df[0].to_excel("styled_excel.xlsx", index = False)

But this is not saving the markings while saving it to excel.

hafiz031
  • 2,236
  • 3
  • 26
  • 48
  • 3
    If you need that much control over Excel, then you need to be controlling Excel. The cute pandas export doesn't have that much control. – Tim Roberts May 26 '22 at 03:43

0 Answers0