Usually, we work with data where the rows are time, so it's good practice to do the same and transpose your data by starting with df = Housing.set_index('CountyName').T
(also, variable names should usually start with a small letter, but this isn't important here).
Since your data is already in such a nice format, there is a pragmatic (in the sense that you need not know too much about datetime
objects and methods) solution, starting with df = Housing.set_index('CountyName').T
:
df.reset_index(inplace = True) # This moves the dates to a column named 'index'
df.rename(columns = {'index':'quarter'}, inplace = True) # Rename this column into something more meaningful
# Rename the months into the appropriate quarters
df.quarter.str.replace('-01|-02|-03', 'q1', inplace = True)
df.quarter.str.replace('-04|-05|-06', 'q2', inplace = True)
df.quarter.str.replace('-07|-08|-09', 'q3', inplace = True)
df.quarter.str.replace('-10|-11|-12', 'q4', inplace = True)
df.drop('SizeRank', inplace = True) # To avoid including this in the calculation of means
c = df.notnull().sum(axis = 1) # Count the number of non-empty entries
df['total'] = df.sum(axis = 1) # The totals on each month
df['c'] = c # only ssign c after computing the total, so it doesn't intefere with the total column
g = df.groupby('quarter')[['total','c']].sum()
g['q_mean'] = g['total']/g['c']
g
g['q_mean']
or g[['q_mean']]
should give you the required answer.
Note that we needed to compute the mean manually because you had missing data; otherwise, df.groupby('quarter').mean().mean()
would have immediately given you the answer you needed.
A remark: the technically 'correct' way would be to convert your dates into a datetime
-like object (which you can do with the pd.to_datetime()
method), then run a groupby with a pd.TimeGrouper()
argument; this would certainly be worth learning more about if you are going to work with time-indexed data a lot.