I plotted a heatmap using seaborn, as shown below. I want to display the highlighted column in abbreviated form, with "K" for thousands and "M" for millions (see the table below). I tried this, but the conversion turns the column into strings, and I get an error when I try to plot those values on the heatmap.

Is there a way to show the first column with the desired values on the heatmap?

[image: heatmap]

Values          Desired Values
662183343.70    662.18M
155554910.90    155.55M

Code used for creating the heatmap

sns.heatmap(heatmap_df, cmap='rocket_r', annot=True, fmt='.4f', linewidths=2,
            cbar_kws={'label': 'Percentiles', 'orientation': 'vertical'},
            vmin=0.95, vmax=1, xticklabels=xticks_label)

Changing the column into K, M format

# convert zeros to K, M etc.
from numerize import numerize as nz

df['PropDmgAdj'] = df['PropDmgAdj'].apply(nz.numerize)
Trenton McKinney
Javed Ali

2 Answers

  • sns.heatmap raises a ValueError when the DataFrame to plot contains strings. Therefore, create a separate DataFrame, annot, and pass it to the annot= parameter.
  • Round the numbers in annot with .round(4) beforehand: because annot contains strings, fmt='' must be used in the plot call, so the format string can no longer do the rounding.
  • Tested in Python 3.10, pandas 1.4.3, matplotlib 3.5.1, seaborn 0.11.2
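Why fmt='' is needed can be illustrated without seaborn. The sketch below assumes that each annotation value is rendered with ordinary Python string formatting using the fmt string, which is my understanding of seaborn's behavior rather than something stated in this thread; render is a hypothetical helper name used only for illustration.

```python
def render(value, fmt):
    """Format one annotation value with a format spec, as '{:<fmt>}'.format(value).

    Assumption: this mirrors how the heatmap annotations are produced.
    """
    return ("{:" + fmt + "}").format(value)


# A float works with a numeric format spec.
print(render(0.897657, ".4f"))   # 0.8977

# A string only works with the empty spec, which leaves it unchanged.
print(render("4.8699M", ""))     # 4.8699M

# With the empty spec, floats are not rounded, hence the .round(4) beforehand.
print(render(0.897657, ""))      # 0.897657

# A string combined with a numeric spec fails.
try:
    render("4.8699M", ".4f")
except ValueError as err:
    print("ValueError:", err)
```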
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

# sample data
np.random.seed(2022)
data = np.random.random(size=(11, 5))
data[:, 0] = data[:, 0] * 10_000_000

# sample dataframe
df = pd.DataFrame(data)

# create a dataframe to use for annotations
annot = df.copy()

# format the desired column; 0 here is the name of the column
annot[0] = annot[0].div(1000000).round(4).astype(str) + 'M'

# format the rest of the numbers in the dataframe
annot = annot.round(4)

# plot
fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(data=df, annot=annot, fmt='', vmin=0, vmax=1, linewidths=2,
            cbar_kws={'label': 'Percentiles', 'orientation': 'vertical'}, ax=ax)

[image: resulting heatmap]
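The example above always divides by one million, so values below a million show up as fractions of "M" (for example 0.0936M). If the unit should depend on the size of each value, as in the question's "K and M" request, a small helper can choose it per value. This is only a sketch: abbreviate_number is a hypothetical name, and the thresholds and the two-decimal default are assumptions, not part of the original answer.

```python
def abbreviate_number(value, decimals=2):
    """Return value as a string with a K, M or B suffix.

    Hypothetical helper: thresholds and rounding are illustrative choices.
    Values below 1,000 are returned without a suffix.
    """
    for threshold, suffix in ((1e9, "B"), (1e6, "M"), (1e3, "K")):
        if abs(value) >= threshold:
            return f"{value / threshold:.{decimals}f}{suffix}"
    return f"{value:.{decimals}f}"


# The two values from the question's table
print(abbreviate_number(662183343.70))  # 662.18M
print(abbreviate_number(155554910.90))  # 155.55M

# First value of the sample data above
print(abbreviate_number(93586.1381))    # 93.59K
```

With the sample DataFrame above, the annotation column could then be built with annot[0] = df[0].map(abbreviate_number) in place of the .div(1000000) line. Note that a value just under a threshold can round up to something like 1000.00K; handle that case separately if it matters.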

Data Views

data

np.array([[9.35861381e+04, 4.99057811e-01, 1.13383690e-01, 4.99740182e-02, 6.85407594e-01],
          [4.86988068e+06, 8.97657226e-01, 6.47452071e-01, 8.96963123e-01, 7.21134929e-01],
          [8.31353421e+06, 8.27568069e-01, 8.33579584e-01, 9.57044336e-01, 3.68044437e-01],
          [4.94837630e+06, 3.39509475e-01, 6.19429326e-01, 9.77529638e-01, 9.64330776e-02],
          [7.44206212e+06, 2.92499474e-01, 2.98675351e-01, 7.52473473e-01, 1.86637277e-02],
          [5.23737436e+06, 8.64435847e-01, 3.88842840e-01, 2.12191849e-01, 4.75180704e-01],
          [5.64672418e+06, 3.49429296e-01, 9.75908627e-01, 3.78200437e-02, 7.94269686e-01],
          [3.57882602e+06, 7.47963953e-01, 9.14509307e-01, 3.72662424e-01, 9.64883473e-01],
          [8.13857731e+05, 4.24509911e-02, 2.96796033e-01, 3.63703625e-01, 4.90255176e-01],
          [6.68518738e+06, 6.73414630e-01, 5.72100640e-01, 8.05922429e-02, 8.98331264e-01],
          [3.83885272e+05, 7.82194421e-01, 3.66563567e-02, 2.67183848e-01, 2.05223845e-01]])

df

               0         1         2         3         4
0   9.358614e+04  0.499058  0.113384  0.049974  0.685408
1   4.869881e+06  0.897657  0.647452  0.896963  0.721135
2   8.313534e+06  0.827568  0.833580  0.957044  0.368044
3   4.948376e+06  0.339509  0.619429  0.977530  0.096433
4   7.442062e+06  0.292499  0.298675  0.752473  0.018664
5   5.237374e+06  0.864436  0.388843  0.212192  0.475181
6   5.646724e+06  0.349429  0.975909  0.037820  0.794270
7   3.578826e+06  0.747964  0.914509  0.372662  0.964883
8   8.138577e+05  0.042451  0.296796  0.363704  0.490255
9   6.685187e+06  0.673415  0.572101  0.080592  0.898331
10  3.838853e+05  0.782194  0.036656  0.267184  0.205224

annot

          0       1       2       3       4
0   0.0936M  0.4991  0.1134  0.0500  0.6854
1   4.8699M  0.8977  0.6475  0.8970  0.7211
2   8.3135M  0.8276  0.8336  0.9570  0.3680
3   4.9484M  0.3395  0.6194  0.9775  0.0964
4   7.4421M  0.2925  0.2987  0.7525  0.0187
5   5.2374M  0.8644  0.3888  0.2122  0.4752
6   5.6467M  0.3494  0.9759  0.0378  0.7943
7   3.5788M  0.7480  0.9145  0.3727  0.9649
8   0.8139M  0.0425  0.2968  0.3637  0.4903
9   6.6852M  0.6734  0.5721  0.0806  0.8983
10  0.3839M  0.7822  0.0367  0.2672  0.2052
Trenton McKinney

Use the lesser-known texts attribute of the matplotlib Axes object that sns.heatmap returns to rewrite the annotations after plotting:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

heatmap_df = pd.DataFrame(
    [
        {"colA": 662183343.70, "colB": 0.9976, "colC": 0.9962},
        {"colA": 155567736.90, "colB": 1.0000, "colC": 1.0000},
        {"colA": 77777.70, "colB": 0.9976, "colC": 0.9962},
        {"colA": 14456.20, "colB": 0.1243, "colC": 0.5356},
    ]
)
heatmap = sns.heatmap(
    heatmap_df,
    cmap="rocket_r",
    annot=True,
    fmt=".4f",
    linewidths=2,
    cbar_kws={"label": "Percentiles", "orientation": "vertical",},
    vmin=0.95,
    vmax=1,
)

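# https://stackoverflow.com/questions/67629794/pandas-format-large-numbers
def abbreviate(text):
    """Shorten large annotation values to B/M/K; values below 10,000 are left unchanged."""
    x = float(text)
    if x >= 1_000_000_000:
        return f"{int(x // 1_000_000_000)}B"
    if x >= 1_000_000:
        return f"{int(x // 1_000_000)}M"
    if x >= 10_000:
        return f"{int(x // 1_000)}K"
    # keep the original text so the '.4f' formatting of small values is preserved
    return text


# https://stackoverflow.com/questions/37602885/adding-units-to-heatmap-annotation-in-seaborn
# heatmap.texts holds the annotation Text objects drawn on the Axes
for t in heatmap.texts:
    t.set_text(abbreviate(t.get_text()))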


plt.show()

The resultant heatmap

mahieyin-rahmun