1

I'm trying to rename two columns after using a groupby of two columns.

fun = {'Age':{'mean_age':'mean', 'median_age':'median'}}
groupbyClass2 = mc_response.groupby(['Country','GenderSelect']).agg(fun).reset_index()
groupbyClass2.columns = groupbyClass2.columns.droplevel(0) 

The dataframe looks like the following:

                        mean_age    median_age
0   Argentina   Female  33.000000   33.0
1   Argentina   Male    33.294118   32.0
2   Australia   Female  35.000000   34.0
3   Australia   Male    37.158416   36.0

Now I want to rename the first column to 'Country' and the second column to 'Gender'. I tried the following code however the two columns will both be renamed to 'Gender'. How can I fix this?

groupbyClass2.rename(columns = {groupbyClass2.columns[0]:'Country', groupbyClass2.columns[1]:'Gender'},inplace = True)
Alex Wang
  • 411
  • 3
  • 16

2 Answers2

3

You can specify column for aggregate after groupby, so possible pass list of tuples for new columns names with aggregate functions, add DataFrame.rename_axis for MultiIndex names for columns names after reset_index:

print (mc_response)
     Country GenderSelect  Age
0  Argentina       Female   10
1  Australia         Male   20
2  Australia       Female   30
3  Australia         Male   43

fun = [('mean_age', 'mean'), ('median_age','median')]
groupbyClass2 = (mc_response.groupby(['Country','GenderSelect'])['Age']
                            .agg(fun)
                            .rename_axis(['Country','Gender'])
                            .reset_index())
print (groupbyClass2)
     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5

Your solution should working with assign new list - first values in list and then convert all columns with indexing:

df.columns = ['Country','Gender'] + df.columns[2:].tolist()
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Hi thanks for the solution. So what is the difference between specifying the 'Age' attribute in the function and outside it? – Alex Wang Mar 31 '19 at 12:12
  • @AlexWang - It is possible solution, maybe help [this](https://stackoverflow.com/questions/53781634/aggregation-in-pandas) for better explanation. – jezrael Mar 31 '19 at 12:14
0

This might be easier:

df.groupby(['Country','Gender'])['Age'].agg([np.mean,np.median]).add_suffix('_age').reset_index()

Output:

     Country  Gender  mean_age  median_age
0  Argentina  Female      10.0        10.0
1  Australia  Female      30.0        30.0
2  Australia    Male      31.5        31.5
BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169