I would like to create a scatter plot in matplotlib from the following data:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'year': [2008, 2014, 2019, 2019.25, 2019.5, 2020],
                   'y': [2, 3, 8, 12, 63, 71],
                   'total_students': [800, 1000, 4000, 4500, 11000, 37000],
                   'male_students': [600, 700, 2000, 2100, 5100, 27000]})
I want to plot year (unevenly spaced years) on the x-axis and y on the y-axis. I would like the marker size to show the two extra variables (total number of students and number of male students) by overlaying two dots, sized accordingly, per data point. I would therefore like the radius of the markers to scale in a logical way, so that the viewer can estimate the proportion of male to female students.
I've tried a few things and had the following problems:
- When changing the size in the scatter command, the scaling is not as expected (i.e. s=10 doesn't have twice the radius of s=5).
- I've tried adding circles instead, with both plt.Circle() and the circle function explained in this post: https://stackoverflow.com/a/24567352/5895788 . However, they require me to set the aspect to equal, which collapses my x-axis completely. I don't quite understand why this happens; maybe because of my time-scale data? Not setting the aspect to equal, or doing ax.set_aspect(1./ax.get_data_ratio()), causes the circles to become elliptical.
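On the first point: as far as I understand the matplotlib documentation, the s argument of scatter is a marker area in points squared, so the radius grows with the square root of s rather than with s itself. Below is a minimal sketch, not a definitive solution, of converting values to s so that either the diameter or the area is proportional to the value. The helper names scatter_sizes and plot_overlaid, the reference parameter, and the 60-point maximum diameter are assumptions made up for illustration.

```python
import math


def scatter_sizes(values, reference, max_diameter_pts=60.0, proportional_to="area"):
    """Return `s` values (in points**2) suitable for plt.scatter.

    `reference` is the value that is drawn with `max_diameter_pts`; passing the
    same reference for both series keeps the overlaid dots comparable.
    proportional_to="radius": marker diameter scales linearly with the value.
    proportional_to="area":   marker area scales linearly with the value.
    """
    if proportional_to == "radius":
        diameters = [max_diameter_pts * v / reference for v in values]
    elif proportional_to == "area":
        diameters = [max_diameter_pts * math.sqrt(v / reference) for v in values]
    else:
        raise ValueError("proportional_to must be 'radius' or 'area'")
    # `s` is an area in points**2, so the diameter in points is squared.
    return [d ** 2 for d in diameters]


def plot_overlaid(years, y, total, male, proportional_to="area"):
    """Overlay two scatter layers that share one size scale."""
    # Imported here so scatter_sizes can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    reference = max(total)
    fig, ax = plt.subplots()
    ax.scatter(years, y, c="red", zorder=1,
               s=scatter_sizes(total, reference, proportional_to=proportional_to))
    ax.scatter(years, y, c="blue", zorder=2,
               s=scatter_sizes(male, reference, proportional_to=proportional_to))
    ax.set_xlabel("year")
    ax.set_ylabel("y")
    return fig, ax
```

With the DataFrame above, this could be called as plot_overlaid(df.year, df.y, df.total_students, df.male_students) followed by plt.show(). Because scatter markers are sized in points rather than data units, they stay round regardless of the axis limits or aspect ratio. Whether "radius" or "area" makes the male share easier to read is a design choice I have not settled.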
I'm not sure if this has any value, but here are some examples of what I tried:
## Option 1: scatter with s parameter
scaling = 0.05
plt.scatter(df.year, df.y, marker="o", zorder=1, c="red", s=df.total_students * scaling)
plt.scatter(df.year, df.y, marker="o", zorder=2, c="blue", s=df.male_students * scaling)
plt.show()
## Option 2: Trying to add circles for the first data point without equal aspect ratio
fig = plt.figure()
ax = fig.add_subplot(111)
plt.ylim((-5, 100))
plt.xlim((2007, 2021))
scaling = 0.01
ax.add_patch(plt.Circle((2008, 2), 800 * scaling))
plt.show()
## Option 3: Trying to set equal aspect ratio
fig = plt.figure()
ax = fig.add_subplot(111, aspect="equal")
plt.ylim((-5, 100))
plt.xlim((2007, 2021))
scaling = 0.01
ax.add_patch(plt.Circle((2008, 2), 800 * scaling))
plt.show()
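
One possible reading of why the x-axis collapses in Option 3, offered as a sketch rather than a confirmed diagnosis: with an equal aspect, one data unit on x is drawn as long as one data unit on y, and the limits above span far fewer units in x than in y. The arithmetic below uses the limits from Option 3; the variable names are only illustrative.

```python
# Axis limits taken from Option 3.
x_min, x_max = 2007, 2021
y_min, y_max = -5, 100

x_span = x_max - x_min  # 14 data units
y_span = y_max - y_min  # 105 data units

# With aspect="equal", the drawn axes box has this width-to-height ratio,
# so the plot ends up as a tall, narrow strip.
width_over_height = x_span / y_span
print(f"axes box is about {width_over_height:.3f} times as wide as it is tall")
```

If that reading is right, the narrow x-axis comes from the very different data ranges of the two axes and not from the uneven spacing of the years.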