I have a DataFrame called 'test_df' that looks like this:

  ID  Year  Value  Value2
0  A  2012      1       4
1  A  2012      2       5
2  A  2013      4       6
3  A  2013      5       7
4  B  2014      6       8
5  B  2014      7       4
6  B  2013      8       8

I want it to look like this:

ID Year  Value_avg  Value2_avg
A  2012  1.5        4.5
A  2013  4.5        6.5
B  2013  8.0        8.0
B  2014  6.5        6.0

However, when I group by multiple columns, the group keys end up in the index (a MultiIndex) instead of as regular columns:

         Value_avg  Value2_avg
ID Year
A  2012        1.5         4.5
   2013        4.5         6.5
B  2013        8.0         8.0
   2014        6.5         6.0

Here is the code I tried:

out_df = pd.DataFrame()
out_df['Value_avg'] = test_df['Value'].groupby([test_df['ID'], test_df['Year']]).mean()
out_df['Value2_avg'] = test_df['Value2'].groupby([test_df['ID'], test_df['Year']]).mean()

I tried adding:

out_df['Value_avg'] = test_df['Value'].groupby([test_df['ID'],
                                                test_df['Year']], as_index=False).mean()

but got this error:

"TypeError: as_index=False only valid with DataFrame"
dasvootz

1 Answer

Group on the DataFrame itself, then chain add_suffix and reset_index:

test_df.groupby(['ID','Year']).mean().add_suffix('_avg').reset_index()
Out[337]: 
  ID  Year  Value_avg  Value2_avg
0  A  2012        1.5         4.5
1  A  2013        4.5         6.5
2  B  2013        8.0         8.0
3  B  2014        6.5         6.0
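
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the question and runs the chain above. It also shows one possible alternative that is not part of the original answer: passing as_index=False to the DataFrame-level groupby together with named aggregation. The error in the question is raised because as_index=False is only accepted when grouping a DataFrame, not a single column (a Series). The name alt_df is just an illustrative choice.

```python
import pandas as pd

# Rebuild the sample data from the question.
test_df = pd.DataFrame({
    "ID": ["A", "A", "A", "A", "B", "B", "B"],
    "Year": [2012, 2012, 2013, 2013, 2014, 2014, 2013],
    "Value": [1, 2, 4, 5, 6, 7, 8],
    "Value2": [4, 5, 6, 7, 8, 4, 8],
})

# Chain from the answer: average every remaining column per (ID, Year),
# rename the result columns, then move the group keys out of the index.
out_df = test_df.groupby(["ID", "Year"]).mean().add_suffix("_avg").reset_index()

# Alternative (an assumption, not from the answer): keep the keys as columns
# with as_index=False and name each output column explicitly.
alt_df = test_df.groupby(["ID", "Year"], as_index=False).agg(
    Value_avg=("Value", "mean"),
    Value2_avg=("Value2", "mean"),
)

print(out_df)
print(alt_df)
```

Both variants should give the same four rows. The named-aggregation form is more verbose, but it lets you pick a different function or output name per column.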
BENY