I have some DataFrame:

import pandas as pd

fake_data = {'columnA': ['XYVA', 'YXYX', 'XAVY', 'XAVY', 'XAAY', 'AXAV', 'AXYV', 'AXXV', 'AXXV', 'AXXV', 'AXXV']}
df = pd.DataFrame(fake_data, columns=['columnA'])
df

I can color the cells by the frequency of each character at each position, following the question "Count the frequency of characters at a position in a string in a Pandas DataFrame column":

new_data = df.columnA.str.split('', n = 4, expand=True).drop(0, axis=1)
stats = new_data.apply(pd.Series.value_counts)
stats = stats.apply(lambda x: x.div(x.sum())*100).round(1).fillna(0)
stats.style.background_gradient(cmap='Greys', axis=None)

Which returns:

[image: the styled DataFrame, with each cell shaded by its percentage]

Now I'm trying to remove the numerical values from the cells (leaving only the color) and show a colorbar for these values instead.

Cactus Philosopher
  • You could colorize the numbers in the same color as the background, so that they appear hidden. As for a colorbar, that's hardly possible with DataFrame.style. Have a look at [annotated heatmaps](https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/image_annotated_heatmap.html) (and possibly just leave out the annotations). – ImportanceOfBeingErnest Aug 31 '19 at 20:51
  • It may be easier for you with `plt.imshow()`. – Quang Hoang Sep 01 '19 at 17:34
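
Following up on the two comments above, here is a minimal sketch of the `plt.imshow()` route, assuming matplotlib is installed. It draws the same percentage table as a grey-scale grid with no numbers in the cells and adds a colorbar. The `apply(list)` step for splitting the strings, the figure size, the axis labels, and the output filename `char_frequency_heatmap.png` are illustrative choices of mine, not something taken from the question.

```python
import matplotlib.pyplot as plt
import pandas as pd

fake_data = {'columnA': ['XYVA', 'YXYX', 'XAVY', 'XAVY', 'XAAY', 'AXAV',
                         'AXYV', 'AXXV', 'AXXV', 'AXXV', 'AXXV']}
df = pd.DataFrame(fake_data)

# One column per character position (1-4); list() splits a string into characters.
# Assumption: every string has exactly four characters, as in the sample data.
chars = pd.DataFrame(df['columnA'].apply(list).tolist(), columns=[1, 2, 3, 4])

# Count each character per position, then convert the counts to percentages.
stats = chars.apply(pd.Series.value_counts)
stats = stats.div(stats.sum()).mul(100).round(1).fillna(0)

fig, ax = plt.subplots(figsize=(3, 3))
im = ax.imshow(stats.to_numpy(), cmap='Greys')  # shaded cells, no annotations

# Label the ticks with the positions and characters from the table.
ax.set_xticks(range(stats.shape[1]))
ax.set_xticklabels(stats.columns)
ax.set_yticks(range(stats.shape[0]))
ax.set_yticklabels(stats.index)
ax.set_xlabel('position')
ax.set_ylabel('character')

# The colorbar carries the values that the cell text used to show.
cbar = fig.colorbar(im, ax=ax)
cbar.set_label('frequency (%)')

fig.savefig('char_frequency_heatmap.png', dpi=150, bbox_inches='tight')
```

In a notebook the figure should display on its own, so the `savefig` call can be dropped there.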

1 Answer

As ImportanceOfBeingErnest commented, it looks like you are trying to wrestle the DataFrame styling system into giving you a heatmap. You'd probably be better off creating an actual heatmap visualization with one of the many plotting libraries available for Python.

Here's an example with my favorite - Altair:

import pandas as pd
import altair as alt

fake_data = {'columnA': ['XYVA', 'YXYX', 'XAVY', 'XAVY', 'XAAY', 'AXAV', 'AXYV', 'AXXV', 'AXXV', 'AXXV', 'AXXV']}
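df = pd.DataFrame(fake_data, columns=['columnA'])

# Split each string into one column per character position (1-4)
new_data = df.columnA.str.split('', n=4, expand=True).drop(0, axis=1)
# Count each character per position, then convert the counts to percentages
stats = new_data.apply(pd.Series.value_counts)
stats = stats.apply(lambda x: x.div(x.sum())*100).round(1).fillna(0)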

alt.Chart(
    stats.unstack().reset_index().rename(columns={"level_0": "position", "level_1":"character", 0: "count_fraction"}),
    height=150,
    width=150
).mark_rect(
).encode(
    x='position:O',
    y='character:O',
    color=alt.Color('count_fraction:Q', scale=alt.Scale(scheme='greys'))
)

[image: heatmap_example]
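
The only non-obvious part of the chart call is the reshaping from the wide `stats` table to the long, one-row-per-cell table that Altair expects. Here is a pandas-only sketch of that step in isolation. It is a variant, not the exact expression used above: it names the axes with `rename_axis` instead of relying on the default `level_0`/`level_1` labels, and it builds the character columns with `apply(list)`, which assumes every string has four characters.

```python
import pandas as pd

fake_data = {'columnA': ['XYVA', 'YXYX', 'XAVY', 'XAVY', 'XAAY', 'AXAV',
                         'AXYV', 'AXXV', 'AXXV', 'AXXV', 'AXXV']}
df = pd.DataFrame(fake_data)

# Wide table: rows are characters, columns are positions, values are percentages.
chars = pd.DataFrame(df['columnA'].apply(list).tolist(), columns=[1, 2, 3, 4])
stats = chars.apply(pd.Series.value_counts)
stats = stats.div(stats.sum()).mul(100).round(1).fillna(0)

# Long table: one row per (position, character) pair.
long_stats = (
    stats.rename_axis(index='character', columns='position')
    .unstack()                  # Series indexed by (position, character)
    .rename('count_fraction')   # same column name as in the chart above
    .reset_index()
)

print(long_stats.head())
```

Note that `count_fraction` holds percentages (0 to 100) rather than fractions, because `stats` was multiplied by 100 before reshaping.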

foglerit