I have a Pandas data frame with 4 columns. How would I color each cell of some specified columns based on some preset criteria before outputting the data frame to an excel file?
# Example Data frame
df = pd.DataFrame({'A':[1,15,10,47,35],
'B':["Mac","Mac","Mac","Mac","Mac"],
'C':["Dog","Dog","Cat","Dog","Tiger"],
'D':["CDN", "USD", "CDN", "Pe", "Dr"]
})
I want to color each element in columns 'B', 'C', 'D' based on the relative frequency of each respective element within the column. For example, the relative frequency of "CDN" in the 'D' column is 2/5 = 0.4.
These are my criteria for the color based on the relative frequency:
Relative frequency | Color |
---|---|
Greater than or equal to 0.90 | Green |
Less than 0.90 and greater than or equal to 0.30 | Yellow |
Less than 0.30 | Red |
Since the relative frequency of "CDN" within the 'D' column is 0.4, then, that cell would be assigned a background color of Yellow.
I know how to find the relative frequency of each element within a column and how to develop the conditional for color based on the relative frequency. What I don't know is how to assign a color to each individual cell at a time. I looked at coloring cells in excel with pandas but the problem is that it assigns the background-color to the entire data frame, not a single cell.
This is what the output excel file from the example should look like: