
I have plotted two variables against each other in Seaborn and used the hue keyword to split the data into two categories.

I want to annotate each regression line with its coefficient of determination. This question only describes how to label a line using the legend.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Read the sheet; cells containing 'x' are treated as missing values
df = pd.read_excel('intubation data.xlsx', sheet_name='Data (pretest)',
                   header=1, na_values='x')
vars_of_interest = ['PGY', 'Time (sec)', 'Aspirate (cc)']
df['Resident'] = df['PGY'] < 4

lm = sns.lmplot(x=vars_of_interest[1], y=vars_of_interest[2],
                data=df, hue='Resident', robust=True, truncate=True,
                line_kws={'label': "bob"})
mac389
  • Please read [How to create a Minimal, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). Moreover, the link you provided doesn't annotate anything; it just shows the legend. There is a difference between the two. – Sheldore Jan 26 '19 at 21:14
  • I understand that there is a difference between annotating and showing a legend. I would be open to either. I think the legend in my example is occupied showing the hue categories. I can't show the data I have (it's health care data). I'll try to make up similar data for an MWE. – mac389 Jan 26 '19 at 21:40
  • You can manually set the legends using plt.legend – Sheldore Jan 26 '19 at 21:41
  • I would prefer an annotation, as the question heading indicates. I can do it manually in matplotlib. I wondered if there was a more elegant way in seaborn via lmplot or the underlying regplot. The manual is unclear. dir(*plot) is unclear. – mac389 Jan 26 '19 at 21:48
  • Seaborn does not give you access to the fitting parameters. It also does not give you access to the individual lines when hue is used. Best create the plot with matplotlib such that you have full control over what and where to annotate. – ImportanceOfBeingErnest Jan 27 '19 at 10:22
  • @mac389 check out the solution provided – RMS Jan 14 '20 at 12:16

1 Answer


Using your code as it is:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Read the sheet; cells containing 'x' are treated as missing values
df = pd.read_excel('intubation data.xlsx', sheet_name='Data (pretest)',
                   header=1, na_values='x')
vars_of_interest = ['PGY', 'Time (sec)', 'Aspirate (cc)']
df['Resident'] = df['PGY'] < 4

p = sns.lmplot(x=vars_of_interest[1], y=vars_of_interest[2],
               data=df, hue='Resident', robust=True, truncate=True,
               line_kws={'label': "bob"}, legend=True)
# assuming you have 2 groups
ax = p.axes[0, 0]
ax.legend()
leg = ax.get_legend()
L_labels = leg.get_texts()
# assuming you computed r_squared (the coefficient of determination) somewhere else;
# 0.3 and 0.21 below are placeholder values
label_line_1 = r'$R^2:{0:.2f}$'.format(0.3)
label_line_2 = r'$R^2:{0:.2f}$'.format(0.21)
# overwrite the legend entries with the R^2 labels
L_labels[0].set_text(label_line_1)
L_labels[1].set_text(label_line_2)

Voilà. (Figure: graph created with my own random data, since the OP hasn't provided any.)
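The code above assumes the R² values were computed elsewhere. One way to get them is sketched below, as an illustration rather than anything seaborn provides: the helper names `r_squared_by_group` and `annotate_r_squared` are made up for this example, and the fit is an ordinary least-squares line via `numpy.polyfit`. That is an assumption worth flagging: with `robust=True`, lmplot draws a robust regression line, so an OLS R² will not describe exactly the line shown on the plot. The second helper writes the values onto the axes with `ax.text`, for anyone who prefers an annotation over relabelling the legend.

```python
import numpy as np


def r_squared_by_group(df, x, y, hue):
    """Return {hue level: R^2 of an ordinary least-squares line of y on x}.

    Hypothetical helper; assumes each group has at least two distinct x
    values and a non-constant y.
    """
    results = {}
    for level, group in df.groupby(hue):
        clean = group[[x, y]].dropna()  # ignore rows with missing values
        xs = clean[x].to_numpy(dtype=float)
        ys = clean[y].to_numpy(dtype=float)
        slope, intercept = np.polyfit(xs, ys, 1)  # degree-1 (straight line) fit
        residuals = ys - (slope * xs + intercept)
        ss_res = np.sum(residuals ** 2)           # unexplained variation
        ss_tot = np.sum((ys - ys.mean()) ** 2)    # total variation
        results[level] = 1.0 - ss_res / ss_tot
    return results


def annotate_r_squared(ax, r2_by_group, x_pos=0.05, y_top=0.95, step=0.07):
    """Write one 'level: R^2 = value' line per group in the top-left of ax.

    Positions are in axes coordinates (0 to 1), so they do not depend on
    the data range.
    """
    for i, (level, r2) in enumerate(sorted(r2_by_group.items())):
        ax.text(x_pos, y_top - i * step,
                r'{0}: $R^2 = {1:.2f}$'.format(level, r2),
                transform=ax.transAxes, va='top')
```

With the plot from the answer, this could be used as `annotate_r_squared(p.axes[0, 0], r_squared_by_group(df, 'Time (sec)', 'Aspirate (cc)', 'Resident'))`. The computed values could equally be passed to the `set_text` calls above in place of the placeholders.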

RMS