I think you need groupby
by first 3
chars of first column by str[:3]
with mean
:
df = df['Mkt-RF'].groupby(df['Unnamed:0'].str[:3]).mean()
Sample:
df = pd.DataFrame({'Unnamed:0':['192607','192608','193609','193610','193611'],
'Mkt-RF':[4,5,6,7,5]})
print (df)
Mkt-RF Unnamed:0
0 4 192607
1 5 192608
2 6 193609
3 7 193610
4 5 193611
#rename column
df = df.rename(columns={'Unnamed:0':'YEARMONTH'})
df = df['Mkt-RF'].groupby(df.YEARMONTH.str[:3]).mean().rename('MEAN').reset_index()
df.YEARMONTH = (df.YEARMONTH + '0').astype(int)
print (df)
YEARMONTH MEAN
0 1920 4.5
1 1930 6.0
Another solution is convert first to_datetime
and groupby
by year
floor divided by 10
:
df = df.rename(columns={'Unnamed:0':'YEARMONTH'})
df.YEARMONTH = pd.to_datetime(df.YEARMONTH, format='%Y%m')
df = df['Mkt-RF'].groupby(df.YEARMONTH.dt.year // 10).mean().rename('MEAN').reset_index()
df.YEARMONTH = df.YEARMONTH *10
print (df)
YEARMONTH MEAN
0 1920 4.5
1 1930 6.0