I have a dataframe that records observations at different locations over several years. I am trying to make a bar plot that shows the total number of observations at each location for each year: for each location, the yearly totals should appear as bars in different colors. My approach is to group by location and count the observations in each group. (I don't think I need to set the date as the index, since I am grouping by location.) I am not able to achieve this using the following code. Help will be much appreciated.
import pandas as pd
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(40,15))
date=df['date']
value=df['value']
df.date = pd.to_datetime(df.date)
year_start=2015
year_stop = 2019
#ax=plt.gca()
for year in range(year_start, year_stop+1):
    ax=plt.gca()
    m=df.groupby(['location']).agg({'value': ['count']})
    plt.ylim(0,45000)
    m.plot(kind='bar', legend = False, figsize=(30,15), fontsize = 30)
#ax.tick_params(axis='both', which='major', labelsize=25)
plt.ylabel('Number of observations - O3', fontsize = 30, fontweight = 'bold')
plt.legend(loc='upper right', prop={'size': 7})
fig_title='Diurnal_'+place
plt.savefig(fig_title, format='png',dpi=500, bbox_inches="tight")
print ('saved=', fig_title)
plt.show()
The header looks like this:
                             date_utc                       date parameter  \
212580  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3
212581  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3
212582  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3
212583  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3
212584  {utc=2020-01-05T05:45:00.000Z  2020-01-05T11:15:00+05:30        o3

                                       location  value   unit       city  \
212580       ICRISAT Patancheru, Mumbai - TSPCB   37.7  µg/m³  Hyderabad
212581  Bollaram Industrial Area, Surat - TSPCB   39.5  µg/m³  Hyderabad
212582          IDA Pashamylaram, Surat - TSPCB   17.8  µg/m³  Hyderabad
212583           Sanathnagar, Hyderabad - TSPCB   56.6  µg/m³  Hyderabad
212584              Zoo Park, Hyderabad - TSPCB   24.5  µg/m³  Hyderabad
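For reference, here is a minimal sketch of what I am aiming for. It is one possible approach, not a tested solution on my real data. Note that the loop in my code above never uses `year`, so every iteration produces the same per-location count. The sketch instead groups by both location and year and unstacks the years into columns so that pandas draws one colored bar per year. The helper names `count_by_location_and_year` and `plot_counts`, the `sample` dataframe and its "Site A"/"Site B" rows are made up for illustration; only the `date`, `location` and `value` columns come from my data. It also assumes that taking the year from the UTC timestamp is acceptable.

```python
import pandas as pd
import matplotlib.pyplot as plt


def count_by_location_and_year(df, year_start=2015, year_stop=2019):
    """Return a table of observation counts: one row per location, one column per year."""
    out = df.copy()
    # Assumption: the year is taken from the UTC timestamp. Apply
    # .dt.tz_convert(...) before .dt.year if the local year is needed.
    out["year"] = pd.to_datetime(out["date"], utc=True).dt.year
    out = out[out["year"].between(year_start, year_stop)]
    # count() skips missing values, so only real observations are counted.
    counts = out.groupby(["location", "year"])["value"].count()
    # Move the years into columns; location/year pairs with no data become 0.
    return counts.unstack("year", fill_value=0)


def plot_counts(counts):
    """Draw a grouped bar chart: one group per location, one colored bar per year."""
    fig, ax = plt.subplots(figsize=(12, 6))
    counts.plot(kind="bar", ax=ax)
    ax.set_ylabel("Number of observations - O3")
    ax.legend(title="Year", loc="upper right")
    return ax


# Made-up sample rows with the same columns as my dataframe.
sample = pd.DataFrame({
    "date": [
        "2015-03-01T10:00:00+05:30",
        "2015-07-01T10:00:00+05:30",
        "2016-01-15T10:00:00+05:30",
        "2015-05-20T10:00:00+05:30",
        "2020-01-05T11:15:00+05:30",  # outside 2015-2019, so it is dropped
    ],
    "location": ["Site A", "Site A", "Site A", "Site B", "Site B"],
    "value": [10.0, 12.0, 9.0, 20.0, 24.5],
})

counts = count_by_location_and_year(sample)
ax = plot_counts(counts)
```

With the real dataframe, `plt.savefig(...)` and `plt.show()` would follow the call to `plot_counts`, as in my original code.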