
I have a dataframe containing numerical data collected over a whole year, which I want to plot as a calendar heatmap (similar to GitHub contributions), where

  • months are plotted along the x-axis
  • days are plotted along the y-axis
  • color of each box/datapoint denotes high/low numerical value: for my example below, blue represents a high number and red represents a low number.

My heatmap/calmap example

I'm using calmap, a Python library that generates calendar heatmaps from Pandas time series data. This is the code I used to create the heatmap above.

days = df_activities.loc[:,'Date'].to_numpy()  
events = pd.Series(df_activities.loc[:, 'ActivityLevel'].to_numpy(), index=days)
calmap.yearplot(events, cmap='RdYlBu', linewidth=0.3)
calmap.plt.title("Activity Data across 2022")

where a snippet of df_activities looks like this:

df_activities[['Date', 'ActivityLevel']].head(10)

    Date            ActivityLevel
0   2022-01-01      5.733
1   2022-01-02      1.317
2   2022-01-03      5.150
3   2022-01-04      7.283
4   2022-01-05      6.450
5   2022-01-06      8.933
6   2022-01-07      7.333
7   2022-01-08      8.483
8   2022-01-09      6.417
9   2022-01-10      5.517

Question: I want to create a customized legend to label my heatmap and show that blue corresponds to high activity level and red corresponds to low activity level. As I'm using calmap and not using seaborn, what would be the best way to achieve this? Any help is greatly appreciated!

It could be something as simple as the legend below, with the colors and labels replaced by those from my dataset.

Sample legend

pinguino

1 Answer

First, you can create a custom discrete colormap similar to this example.

Second, you can create a series of empty plots whose only purpose is to hold the labels for the legend (assuming you already have a list of labels, one per color).

Finally, bbox_to_anchor can be used to place the legend outside the plot. More details about this parameter can be found in this excellent answer.

Hope it helps.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import calmap
from matplotlib import colors

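# Generate sample data: 500 random days (drawn with replacement) out of 700 starting 2022-01-01.
# calmap.yearplot plots the first year found in the data by default.
all_days = pd.date_range('2022-01-01', periods=700, freq='D')
days = np.random.choice(all_days, 500)
events = pd.Series(np.random.randn(len(days)), index=days)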

# The colors and labels can come from elsewhere, as long as both lists have the same length.
# ListedColormap assigns colors in order from vmin to vmax, so here blue = low and red = high.
cm_colors = ['blue', 'green', 'red']
cm_labels = ['Low', 'Medium', 'High']

cmap = colors.ListedColormap(cm_colors)

fig, ax = plt.subplots(figsize=(10, 3))
calmap.yearplot(events, cmap=cmap, vmin=events.min(), vmax=events.max(), linewidth=0.3, ax=ax)

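# Empty plots draw nothing; they only register one legend entry per color
for color, label in zip(cm_colors, cm_labels):
    ax.plot([], [], c=color, label=label)

# Anchor the legend just outside the right edge of the axes, vertically centered
ax.legend(bbox_to_anchor=(1.04, 0.5), loc='center left')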
plt.title('Activity data across 2022')
fig.tight_layout()

plt.show()

The result is: Sample result

In addition, the calmap.yearplot source can be found here.
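As a variation on the empty-plot trick, the legend entries can also be built from matplotlib Patch handles, which show up as colored boxes rather than lines. This is only a minimal sketch, not something calmap provides: add_color_legend is a hypothetical helper name, and the red/yellow/blue ordering is an assumption chosen so that red is low and blue is high, as asked in the question.

```python
import matplotlib.pyplot as plt
from matplotlib import colors
from matplotlib.patches import Patch


def add_color_legend(ax, cmap, labels):
    """Hypothetical helper: one colored box per entry of a ListedColormap."""
    if len(labels) != cmap.N:
        raise ValueError("need exactly one label per color")
    # Calling the colormap with an integer index returns that entry's RGBA color
    handles = [
        Patch(facecolor=cmap(i), edgecolor="none", label=label)
        for i, label in enumerate(labels)
    ]
    # Same placement as above: just outside the right edge, vertically centered
    return ax.legend(handles=handles, bbox_to_anchor=(1.04, 0.5), loc="center left")


# Assumed ordering: first color = lowest values, last color = highest values
cmap = colors.ListedColormap(["red", "yellow", "blue"])

fig, ax = plt.subplots(figsize=(10, 3))
# calmap.yearplot(events, cmap=cmap, linewidth=0.3, ax=ax) would be called here
legend = add_color_legend(ax, cmap, ["Low", "Medium", "High"])
fig.tight_layout()
```

Add plt.show() at the end to display the figure. Since the helper only needs an Axes and a colormap, it should work the same way whether the heatmap was drawn by calmap or by something else.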

dimnnv
  • I'm looking to create a custom legend rather than a colormap here. I think your code rather just selects distinct colors for the heatmap but doesn't create any legends. Please see the last section of my question, thanks! – pinguino Mar 11 '23 at 09:22
  • @pinguino This is actually a more interesting problem; I have updated my solution. It looks like there is no built-in way to achieve this, but you could use tricks like the one I'm suggesting. – dimnnv Mar 11 '23 at 14:50
  • The trick in your updated solution works for my problem, thanks for your help! – pinguino Mar 19 '23 at 14:21