How to Annotate Scatterplot Datapoints on Hover with Pandas Dataframe?

Question

I am new to matplotlib and am trying to plot religious adherence against violent crime rates in California by county. I would like an annotation showing the county name to appear when I hover over a data point. I tried to follow the guidance here, but I don't fully understand how to apply it in this context.

The dataframe looks like this.

This is the code I've written so far to plot the points and add a regression line through the middle.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats

%matplotlib notebook

crime = pd.read_csv('example.csv')
crime = crime[(crime['reportyear']==2010) & (crime['geotype']=='CO')]
crime = crime[['reportyear', 'geotype', 'county_name', 'numerator', 'denominator', 'ratex1000']].rename(columns={'county_name':'county'})

religion = pd.read_excel('example2.xlsx')
religion = religion[(religion['STABBR']=='CA')]
religion['CNTYNAME'] = religion['CNTYNAME'].str.replace(' County','')
religion = religion[['CNTYNAME', 'POP2010', 'TOTCNG', 'TOTADH', 'TOTRATE']].rename(columns={'CNTYNAME':'county'})
#religion.head()

rel_crime = pd.merge(crime, religion, how='left', on='county')
rel_crime = rel_crime.rename(columns={'POP2010':'population_rel',
                                     'TOTCNG':'tot_congregations',
                                     'TOTADH':'tot_adherents',
                                     'TOTRATE':'rate_adherents'})

fig, ax = plt.subplots()
ax.scatter(rel_crime['rate_adherents'], rel_crime['ratex1000'])

m, b, r_value, p_value, std_err = scipy.stats.linregress(rel_crime['rate_adherents'], rel_crime['ratex1000'])
ax.plot(rel_crime['rate_adherents'], m*rel_crime['rate_adherents'] + b, color='red')
ax.annotate(f'r^2: {r_value**2:.2f}', (1000, 8))

plt.title('Violent Crime Rate vs. Religious Adherence Rate \n by County (California, 2010)')
plt.xlabel('Adherents (All Denominations) per 1000')
plt.ylabel('Violent Crimes per 1000')

Could anyone advise how to annotate each data point with the county name when hovering, given that the points are plotted at the rate_adherents (x) and ratex1000 (y) coordinates?
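For reference, here is a minimal sketch of one way this could work, using only matplotlib's own event handling rather than a helper library such as mplcursors. The three-row rel_crime frame is placeholder data with made-up values standing in for the merged dataframe above, and the names sc, annot and on_hover are chosen here for illustration. It assumes the scatter points are in the same order as the dataframe rows, so the positional index of the point under the pointer can be used to look up the county.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data standing in for rel_crime; the values are made up.
rel_crime = pd.DataFrame({
    'county': ['County A', 'County B', 'County C'],
    'rate_adherents': [300.0, 450.0, 600.0],
    'ratex1000': [5.0, 3.5, 4.2],
})

fig, ax = plt.subplots()
# Keep the PathCollection returned by scatter so it can be hit-tested later.
sc = ax.scatter(rel_crime['rate_adherents'], rel_crime['ratex1000'])

# One reusable annotation, hidden until the pointer is over a point.
annot = ax.annotate('', xy=(0, 0), xytext=(10, 10),
                    textcoords='offset points',
                    bbox=dict(boxstyle='round', fc='w'))
annot.set_visible(False)


def on_hover(event):
    """Show the county name for the point under the pointer, if any."""
    hit, details = False, {}
    if event.inaxes is ax:
        hit, details = sc.contains(event)
    if hit:
        i = details['ind'][0]  # positional index of the first point hit
        annot.xy = sc.get_offsets()[i]
        annot.set_text(rel_crime['county'].iloc[i])
        annot.set_visible(True)
    elif annot.get_visible():
        annot.set_visible(False)
    else:
        return  # nothing changed, so skip the redraw
    fig.canvas.draw_idle()


fig.canvas.mpl_connect('motion_notify_event', on_hover)
# With %matplotlib notebook the figure is already interactive;
# in a plain script, call plt.show() here.
```

Because the lookup uses .iloc, it should still work after filtering or merging leaves the dataframe with a non-consecutive index, as long as the rows passed to scatter are not reordered afterwards.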

See e.g. https://stackoverflow.com/questions/68276345/how-to-annotate-with-multiple-columns-using-mplcursors or https://stackoverflow.com/questions/68796153/how-to-add-entire-dataframe-row-as-scatter-plot-annotation . The post you linked to uses an older, more complicated method. Please note that StackOverflow's rules ask to add test data in text format. — JohanC, Oct 13 '22 at 22:30
