2

I have a Pandas data frame with 4 columns. How would I color each cell of some specified columns based on some preset criteria before outputting the data frame to an excel file?

# Example Data frame

df = pd.DataFrame({'A':[1,15,10,47,35],      
    'B':["Mac","Mac","Mac","Mac","Mac"],
    'C':["Dog","Dog","Cat","Dog","Tiger"], 
    'D':["CDN", "USD", "CDN", "Pe", "Dr"]
})

I want to color each element in columns 'B', 'C', 'D' based on the relative frequency of each respective element within the column. For example, the relative frequency of "CDN" in the 'D' column is 2/5 = 0.4.

These are my criteria for the color based on the relative frequency:

Relative frequency Color
Greater than or equal to 0.90 Green
Less than 0.90 and greater than or equal to 0.30 Yellow
Less than 0.30 Red

Since the relative frequency of "CDN" within the 'D' column is 0.4, then, that cell would be assigned a background color of Yellow.

I know how to find the relative frequency of each element within a column and how to develop the conditional for color based on the relative frequency. What I don't know is how to assign a color to each individual cell at a time. I looked at coloring cells in excel with pandas but the problem is that it assigns the background-color to the entire data frame, not a single cell.

This is what the output excel file from the example should look like:

enter image description here

Diamoniner12345
  • 414
  • 2
  • 10

2 Answers2

1

This can be a possible solution:

def highlight_cell(value):
    # check relative occurrence of that value
    relative_occurrence = some_value
    if relative_occurrence >=  0.9:
        return 'background-color: green;'
    elif relative_occurrence < 0.3:
        return 'background-color: red;'
    return 'background-color: yellow;'

df = df.style.applymap(hightlight_cell)

If you want to know more differences between apply, applymap and map check this link Difference between map, applymap and apply methods in Pandas

0

The Openpyxl Python library supports this functionality.

This might help:

from openpyxl.styles import colors
from openpyxl import Workbook

red = PatternFill(start_color='FF0000')
yellow = PatternFill(start_color='FFFF00')
green = PatternFill(start_color='00FF00')    

# Create new active workbook (Excel spreadsheet)
wb = Workbook()
ws = wb.active

# Append your dataframe 'df' to active Workbook
for r in dataframe_to_rows(df, index=True, header=True):
    ws.append(r)

# Fill each cell with proper color
for cell in ws:
    if cell.value < .3:
        cell.fill = red
    elif cell.value >= .9:
        cell.fill = green
    else:
        cell.fill = yellow

# Save as Excel file
wb.save("pandas_openpyxl.xlsx")