I'm plotting some numbers from my pandas dataframes df1 and df2 using seaborn and subplots, and would like to add an annotated line showing the median value for a column from each dataframe on the subplots.

I am able to draw a line at the median using axvline in the following code, but I'm not sure how to annotate the drawn line so that it displays the actual median value:

import matplotlib.pyplot as plt
import seaborn as sns

# df1 and df2 are existing DataFrames with a 'values' column
df1median = df1['values'].median()
df2median = df2['values'].median()

fig, axes = plt.subplots(1, 2)
sns.kdeplot(data=df1, x='values', ax=axes[0])
axes[0].axvline(df1median)

sns.kdeplot(data=df2, x='values', ax=axes[1])
axes[1].axvline(df2median)

I can annotate with plt.text by setting the position by hand, but I would like a way to annotate the drawn axvline on each subplot directly. Is this possible?

  • Have you tried adding a ‘label’ value to axvline? It will probably show with the legend. – ai2ys Jan 05 '22 at 13:47
  • See also [How to plot a mean line on a distplot between 0 and the y value of the mean?](https://stackoverflow.com/a/63309583/12046409) about using a fill color to show the median together with the quartiles. – JohanC Jan 05 '22 at 15:06
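The label suggestion from the first comment can be sketched roughly as follows. This is only an illustration: the data is made up, and a plain matplotlib histogram stands in for sns.kdeplot so that the snippet needs nothing beyond numpy and matplotlib. The label only becomes visible once a legend is created.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
values = rng.normal(0.1, 2, 500).cumsum()  # made-up sample data
median = float(np.median(values))

fig, ax = plt.subplots()
ax.hist(values, bins=30, density=True, alpha=0.4)  # stand-in for sns.kdeplot
# Put the actual median value into the label of the vertical line
line = ax.axvline(median, color='r', ls=':', label=f'median = {median:.2f}')
legend = ax.legend()  # without a legend, the label is never shown
```

Call plt.show() afterwards to display the figure. The value then appears in the legend rather than next to the line itself.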

1 Answer

You can use the x-axis transform to give an x-position in data coordinates and a y-position in axes coordinates (0 at the bottom and 1 at the top of the subplot):

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

df1 = pd.DataFrame({'values': np.random.normal(.1, 2, 500).cumsum()})
df2 = pd.DataFrame({'values': np.random.normal(.1, 3, 1000).cumsum()})

fig, axes = plt.subplots(ncols=2, figsize=(12, 4))
for df, ax in zip([df1, df2], axes):
    sns.kdeplot(data=df, x='values', ax=ax)
    dfmedian = df['values'].median()
    ax.axvline(dfmedian, color='r', ls=':')
    ax.text(dfmedian, 0.99, 'median', color='r', ha='right', va='top', rotation=90,
            transform=ax.get_xaxis_transform())
plt.tight_layout()
plt.show()

[Figure: sns.kdeplot annotating the median]
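Since the question asks for the actual median value, the same approach can also write the number instead of the fixed word 'median'. The following is a minimal sketch under some assumptions: annotate_vline is a hypothetical helper name, the data is made up, and a matplotlib histogram stands in for sns.kdeplot to keep the snippet dependent on numpy and matplotlib only.

```python
import numpy as np
import matplotlib.pyplot as plt

def annotate_vline(ax, x, text, y=0.99):
    """Hypothetical helper: draw a vertical line at data position x and label it.

    x is in data coordinates, y is in axes coordinates (0 = bottom, 1 = top).
    """
    line = ax.axvline(x, color='r', ls=':')
    label = ax.text(x, y, text, color='r', ha='right', va='top', rotation=90,
                    transform=ax.get_xaxis_transform())
    return line, label

rng = np.random.default_rng(0)
# Made-up sample data, one array per subplot
samples = [rng.normal(0.1, 2, 500).cumsum(), rng.normal(0.1, 3, 1000).cumsum()]

fig, axes = plt.subplots(ncols=2, figsize=(12, 4))
labels = []
for values, ax in zip(samples, axes):
    ax.hist(values, bins=30, density=True, alpha=0.4)  # stand-in for sns.kdeplot
    median = float(np.median(values))
    _, label = annotate_vline(ax, median, f'median = {median:.2f}')
    labels.append(label)
fig.tight_layout()
```

Call plt.show() afterwards to display the figure. The format specifier .2f is just one choice; adjust it to the precision your data needs.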

JohanC
  • I am a bit confused: since "0.99" is used for the y location, why is the transform taken from ax.get_yaxis_transform()? – VeritatemAmo Aug 24 '23 at 18:15
  • `0.99` is measured in [axes coordinates](https://matplotlib.org/stable/tutorials/advanced/transforms_tutorial.html), so it leaves a small margin below the top border. With `1.0` the text would touch the border, making it harder to read. – JohanC Aug 24 '23 at 18:42
  • Thanks! I actually meant: why use ax.get_xaxis_transform() instead of ax.get_yaxis_transform()? Also, just to throw it out there: what if there are a couple of axvline lines close to each other, and the texts need to avoid touching (e.g., some lines have the text on the left and some on the right, or the y coordinates are changed to avoid overlap)? I am running into a similar problem that largely overlaps with the OP's problem, so I don't want to ask in a separate post. – VeritatemAmo Aug 28 '23 at 20:54
  • Only `get_xaxis_transform()` can be used to align text near the top or bottom spines while keeping the x-position in data coordinates. If you have multiple lines close to each other, you need to write your own test to avoid overlapping; there is no automatic solution. – JohanC Aug 28 '23 at 22:22
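One possible shape for such an overlap test is sketched below. It is only a rough illustration, not a general solution: label_vlines is a hypothetical helper, the positions are made up, and the only remedy it tries is flipping a colliding label to the other side of its line. It compares the rendered bounding boxes of the labels, which is why the figure is drawn before the extents are read.

```python
import matplotlib.pyplot as plt

def label_vlines(ax, xs, texts, y=0.99):
    """Hypothetical helper: label vertical lines, flipping a label to the
    right-hand side of its line if it would overlap an earlier label."""
    fig = ax.figure
    placed = []
    for x, text in sorted(zip(xs, texts)):
        ax.axvline(x, color='r', ls=':')
        label = ax.text(x, y, text, color='r', ha='right', va='top', rotation=90,
                        transform=ax.get_xaxis_transform())
        fig.canvas.draw()  # make sure a renderer exists and limits are settled
        box = label.get_window_extent()  # bounding box in display (pixel) units
        if any(box.overlaps(other.get_window_extent()) for other in placed):
            label.set_ha('left')  # text now starts at the line and extends right
        placed.append(label)
    return placed

fig, ax = plt.subplots()
ax.set_xlim(0, 10)  # fixed limits, so adding lines does not rescale the axis
xs = [5.0, 5.05, 8.0]  # made-up positions; the first two are very close
placed = label_vlines(ax, xs, [f'x = {x:.2f}' for x in xs])
sides = [label.get_ha() for label in placed]
```

With these made-up positions, the second label should end up on the right-hand side of its line while the other two stay on the left. With three or more lines in a tight cluster a flipped label can still collide with a later one, so lowering y for some labels (for example y=0.6) would be the next thing to try.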