I'm trying to add annotations (the r and p values for each regression) to the lmplots in a FacetGrid. Because I'm using "hue", each plot has two regression lines, so I get two annotations stacked on top of each other. I'd like to either display them in different locations or, ideally, compute the values from the complete dataset rather than separately for each level of the variable passed to hue. I assume I need to modify "data" in the annotate function for that, but I cannot figure out how.
I did manage to do it by creating a dataframe that contains all the r and p values and looping through g.axes_dict.items(), but I would like a more elegant solution where the values can be calculated and displayed directly.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy import stats

# Example data: four IDs, two genders, two observations per ID and gender
data_dict = {
    'ID': ['A','B','C','D','A','B','C','D','A','B','C','D','A','B','C','D'],
    'SCORE': [18,20,37,40,34,21,24,12,34,54,23,43,23,31,65,78],
    'AGE': [34,54,46,65,43,23,54,23,43,54,23,32,56,42,12,43],
    'GENDER': [1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2]
}
df = pd.DataFrame(data_dict)

g = sns.lmplot(x='SCORE', y='AGE', data=df, hue='GENDER',
               col='ID', height=3, aspect=1)

def annotate(data, **kws):
    # Called once per facet and hue level, so every panel gets two labels
    r, p = stats.pearsonr(data['SCORE'], data['AGE'])
    ax = plt.gca()
    ax.text(.05, .8, 'r={:.2f}, p={:.2g}'.format(r, p),
            transform=ax.transAxes)

g.map_dataframe(annotate)
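
For reference, here is a minimal sketch of the kind of workaround described above (a dataframe of r and p values plus a loop over g.axes_dict.items()). It assumes r and p are computed once per ID on all rows, ignoring GENDER; the names `rows` and `corr` are illustrative and not taken from the original code.

```python
import pandas as pd
import seaborn as sns
from scipy import stats

# Same example data as in the question, written compactly
df = pd.DataFrame({
    'ID': ['A', 'B', 'C', 'D'] * 4,
    'SCORE': [18, 20, 37, 40, 34, 21, 24, 12, 34, 54, 23, 43, 23, 31, 65, 78],
    'AGE': [34, 54, 46, 65, 43, 23, 54, 23, 43, 54, 23, 32, 56, 42, 12, 43],
    'GENDER': [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2],
})

g = sns.lmplot(x='SCORE', y='AGE', data=df, hue='GENDER',
               col='ID', height=3, aspect=1)

# One row of r and p per ID, computed on all rows for that ID (both genders)
rows = []
for id_value, sub in df.groupby('ID'):
    r, p = stats.pearsonr(sub['SCORE'], sub['AGE'])
    rows.append({'ID': id_value, 'r': r, 'p': p})
corr = pd.DataFrame(rows).set_index('ID')

# axes_dict maps each col value to its Axes, so each facet gets a single label
for id_value, ax in g.axes_dict.items():
    r, p = corr.loc[id_value, ['r', 'p']]
    ax.text(.05, .8, 'r={:.2f}, p={:.2g}'.format(r, p),
            transform=ax.transAxes)
```

This produces one annotation per panel, but the statistics are computed outside the grid, which is the part I would like to avoid.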