2

I'm using Seaborn to plot a cumulative distribution and its KDE using this code:

sns.distplot(values, bins=20, 
             hist_kws= {'cumulative': True}, 
             kde_kws= {'cumulative': True} )

This gives me the following chart:

Cumulative distribution and kde

I'd like to plot a vertical line and the corresponding x index where y is 0.8. Something like:

KDE with vertical line

How do I get the x value of a specific y?

neves

2 Answers

5

You could draw a vertical line at the 80% quantile:

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

values = np.random.normal(1, 20, 1000)
sns.distplot(values, bins=20,
             hist_kws= {'cumulative': True},
             kde_kws= {'cumulative': True} )
plt.axvline(np.quantile(values, 0.8), color='r')
plt.show()

example plot
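To see why the quantile answers the question, here is a minimal numeric check (a sketch, not part of the original answer). It assumes the same kind of sample as above, with a fixed seed added only for reproducibility. The 80% quantile is, by definition, the x below which about 80% of the data falls, which is where the empirical cumulative curve reaches 0.8. The smoothed KDE curve may cross 0.8 at a slightly different x.

```python
import numpy as np

# Seeded generator so the check is reproducible (the seed is an arbitrary choice)
rng = np.random.default_rng(0)
values = rng.normal(1, 20, 1000)

# x position of the 80% quantile
q80 = np.quantile(values, 0.8)

# Fraction of the sample at or below that x: the empirical CDF evaluated at q80
frac_below = np.mean(values <= q80)

print(f"80% quantile: {q80:.2f}")
print(f"fraction of values <= quantile: {frac_below:.3f}")
```

The printed fraction should be very close to 0.8, so the vertical line sits where the cumulative histogram reaches that height.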

JohanC
  • Nice and clear solution. You can annotate the line with a label: `line = ax.axvline(x, color='r'); ax.annotate(f'{x:.0f}', xy=(x,0), xytext=(0,-14), color=line.get_color(), xycoords=ax.get_xaxis_transform(), textcoords="offset points", size=12, ha="center")` – neves Jul 20 '20 at 14:56
2

The answer by @JohanC is probably the best. I went another route, which is maybe a slightly more generic solution.

The idea is to get the coordinates of the KDE line, then find the index of the point where it crosses the threshold value:

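import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

values = np.random.normal(size=(100,))
fig = plt.figure()
ax = sns.distplot(values, bins=20,
                  hist_kws={'cumulative': True},
                  kde_kws={'cumulative': True})

# The first line on the axes is the cumulative KDE curve
x, y = ax.lines[0].get_data()
thresh = 0.8
# Indices i where the curve crosses the threshold between points i and i+1
idx = np.where(np.diff(np.sign(y - thresh)))[0]
x_val = x[idx[0]]
ax.axvline(x_val, color='red')
plt.show()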

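The value found this way is the x of a grid point on the curve, so it is only as precise as the grid spacing. A possible refinement is to interpolate linearly between the two points that bracket the threshold. The sketch below is an illustration under assumptions: `interpolated_crossing` is a hypothetical helper name, and a logistic curve stands in for the `x, y` arrays that would come from `ax.lines[0].get_data()`, so that the snippet runs without a plot.

```python
import numpy as np

def interpolated_crossing(x, y, thresh):
    """Return the x where the curve (x, y) first crosses thresh,
    linearly interpolated between the two bracketing points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Indices i where the sign of (y - thresh) changes between i and i+1
    idx = np.where(np.diff(np.sign(y - thresh)))[0]
    if idx.size == 0:
        raise ValueError("curve never crosses the threshold")
    i = idx[0]
    x0, x1 = x[i], x[i + 1]
    y0, y1 = y[i], y[i + 1]
    # Straight-line interpolation between (x0, y0) and (x1, y1)
    return x0 + (thresh - y0) * (x1 - x0) / (y1 - y0)

# Stand-in for the KDE line data: a logistic curve rising from 0 to 1.
# It crosses 0.8 at x = ln(4), which makes the result easy to check.
x = np.linspace(-5, 5, 201)
y = 1 / (1 + np.exp(-x))

x_cross = interpolated_crossing(x, y, 0.8)
print(x_cross, np.log(4))
```

With the real plot, the same helper could be called on the arrays returned by `ax.lines[0].get_data()` and the result passed to `ax.axvline`.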

Diziet Asahi
  • The answer can be improved further to find the exact interpolated x value as in [this magic post](https://stackoverflow.com/questions/46909373/how-to-find-the-exact-intersection-of-a-curve-as-np-array-with-y-0/46911822#46911822). – JohanC Jul 20 '20 at 13:43
  • I selected @JohanC's answer because it is simpler, but this answer is also excellent. It can be more efficient since it isn't necessary to reprocess the histogram. I also like that it gives an interpolated value instead of one that is present in the population. – neves Jul 20 '20 at 14:54