How do you rename all columns in multi level group by in pandas 0.20.1+

Question

With the release of Pandas 0.20.1, passing a nested dictionary to groupby.agg() in order to rename the resulting columns has been deprecated.

Deprecation documentation

I'm trying to find the best way to update my code to account for this, but I'm struggling because of how I currently use this rename functionality.

When I am doing an aggregate, I often have multiple functions for each source column, and I have been using this rename functionality to get to a single level index with these new column names.

Example:

df = pd.DataFrame({'A': [1, 1, 1, 2, 2],'B': range(5),'C': range(5)})

In [30]: df
Out[30]: 
   A  B  C
0  1  0  0
1  1  1  1
2  1  2  2
3  2  3  3
4  2  4  4

frame = df.groupby('A').agg({'B' : {'foo':'sum'}, 'C': {'bar' : 'min', 'bar2': 'max'}})

Which results in:

Out[33]: 
    B   C     
  foo bar bar2
A             
1   3   0    2
2   7   3    4

Which I then typically do:

frame = pd.DataFrame(frame).reset_index(col_level=1)

frame.columns = frame.columns.get_level_values(1)

frame
Out[42]: 
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4

So I'm looking for a good way to get a result DataFrame that has a single-level column index with new, unique column names, even when multiple columns come from aggregating a single source column. Any recommendations on the best approach are greatly appreciated.

jezrael · Accepted Answer · 2017-05-10T14:36:42.770

19

This works in version 0.20.1:

d = {'sum':'foo','min':'bar','max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']}).rename(columns=d)
frame.columns = frame.columns.droplevel(0)
frame = frame.reset_index()
print (frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4

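If the same function (for example min) is applied to more than one column, the function names alone are no longer unique, so join both column levels first and rename afterwards: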

d = {'B_sum':'foo','C_min':'bar','C_max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']})
frame.columns = frame.columns.map('_'.join)
frame = frame.reset_index().rename(columns=d)
print (frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4
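
On newer pandas releases the flatten-and-rename step may not be needed at all. The sketch below assumes pandas 0.25 or later, which is newer than the 0.20.1 discussed in this thread, and uses named aggregation, where each keyword is the output column name and each value is a (source column, function) pair. It reuses the `df` from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# Named aggregation (assumes pandas >= 0.25):
# new_column_name=(source_column, aggregation_function)
frame = (
    df.groupby('A')
      .agg(foo=('B', 'sum'), bar=('C', 'min'), bar2=('C', 'max'))
      .reset_index()
)
print(frame)
```

Because every output name is chosen explicitly, applying the same function to several columns does not produce clashing names. The second element of each pair can also be a callable such as a lambda.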

edited May 10 '17 at 14:36

answered May 10 '17 at 14:28

jezrael


Thank you, overall I think I can work with that. Only challenge I have is if similar or same agg function is used more than once. I often use lambda functions across my columns. Simple example though would be if I added: 'B' : ['sum','min'] and then had two columns as 'min'. – Mark Doom May 10 '17 at 14:34
Yes, then you first need to create a flat `Index` from the `MultiIndex`; the simpler solution is with `map`. Check the edited answer. – jezrael May 10 '17 at 14:37
Perfect. Makes sense! – Mark Doom May 10 '17 at 14:40
Thanks for your usual contribution to pandas related question. It introduced a new function `droplevel` to me – Bowen Liu Jan 31 '19 at 19:25

score 7 · Answer 2 · answered May 10 '17 at 15:06


Here is a slightly shorter alternative:

In [78]: d={'C_min':'min_C', 'C_sum':'sum_C','B_min':'min_B','B_sum':'sum_B'}

In [79]: frame
Out[79]:
    C       B
  min sum min sum
A
1   0   3   0   3
2   3   7   3   7

In [80]: frame.columns = frame.columns.map('_'.join).to_series().map(d)

In [81]: frame
Out[81]:
   min_C  sum_C  min_B  sum_B
A
1      0      3      0      3
2      3      7      3      7

answered May 10 '17 at 15:06

MaxU - stand with Ukraine



Instead of typing out the mapping `d` manually, you can use `.swaplevel()`. The whole thing would then look like `frame.columns.swaplevel().map('_'.join)` – Gene Burinsky Jun 27 '19 at 19:48
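
A minimal sketch of the `swaplevel` suggestion from the comment above. The `frame` in this answer is shown without the code that built it, so the `agg` call below is an assumption chosen to reproduce the same two-level columns from the question's `df`.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# Assumed construction of the two-level frame shown in this answer.
frame = df.groupby('A').agg({'C': ['min', 'sum'], 'B': ['min', 'sum']})

# Swap the levels so the function name comes first, then join each
# (function, column) tuple into a single flat label.
frame.columns = frame.columns.swaplevel().map('_'.join)
print(frame)
```

This yields labels of the form `min_C` and `sum_B` without maintaining a separate mapping, at the cost of not being able to pick arbitrary names.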

score 0 · Answer 3 · answered May 10 '17 at 14:27


You could just call droplevel on the columns and then reset_index:

In [46]:
frame.columns = frame.columns.droplevel(0)
frame = frame.reset_index()
frame

Out[46]:
   A  bar  bar2  foo
0  1    0     2    3
1  2    3     4    7

answered May 10 '17 at 14:27

EdChum



As of Pandas 0.23.4, `np.columns.droplevel()` is no longer available. – yoonghm Dec 02 '18 at 14:34
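
If `droplevel` is not usable in a given environment, one alternative that avoids it is `get_level_values`, which the question itself already uses. The sketch below assumes the non-deprecated list form of `agg` rather than the nested dictionary, so the remaining labels are the function names rather than `foo`, `bar`, `bar2`.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# List-style agg produces two column levels: (source column, function).
frame = df.groupby('A').agg({'B': ['sum'], 'C': ['min', 'max']})

# Keep only the innermost level (the function names) without droplevel.
frame.columns = frame.columns.get_level_values(-1)
frame = frame.reset_index()
print(frame)
```

The resulting names are only unique when each function is used once across all columns; otherwise joining both levels, as in the accepted answer, is the safer route.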
