56

I'm trying to left join multiple pandas dataframes on a single Id column, but when I attempt the merge I get warning:

KeyError: 'Id'.

I think it might be because my dataframes have offset columns resulting from a groupby statement, but I could very well be wrong. Either way I can't figure out how to "unstack" my dataframe column headers. None of the answers at this question seem to work.

My groupby code:

step1 = pd.DataFrame(step3.groupby(['Id', 'interestingtabsplittest2__grp'])['applications'].sum())
step1.sort('applications', ascending=False).head(3)

Returns:

offset headers

How to get those offset headers into the top level?

Community
  • 1
  • 1
samthebrand
  • 3,020
  • 7
  • 41
  • 56

2 Answers2

99

You're looking for .reset_index().

In [11]: df = pd.DataFrame([[2, 3], [5, 6]], pd.Index([1, 4], name="A"), columns=["B", "C"])

In [12]: df
Out[12]:
   B  C
A
1  2  3
4  5  6

In [13]: df.reset_index()
Out[13]:
   A  B  C
0  1  2  3
1  4  5  6

Note: That you can avoid this step by using as_index=False when doing the groupby.

step1 = step3.groupby(['Id', 'interestingtabsplittest2__grp'], as_index=False)['applications'].sum()
Andy Hayden
  • 359,921
  • 101
  • 625
  • 535
  • 21
    If you implemented agg function at the end of the group. The column group couldn't be flatten by as_index. I found the answer from this [link](https://stackoverflow.com/a/50571897/7782929). Hope it helps for some people like me. – Kwang-Chun Kang Dec 20 '18 at 02:17
  • 1
    The solution that worked for me is `df.reset_index(drop=True, inplace=True)` The **drop=True** was the critical part. – Shane S Dec 08 '21 at 19:29
29

The accepted answer doesn't work if you do multiple aggregation with .agg() or if you're grouping by multiple columns

You can instead drop the topmost level(s) and then reset the index.

df.droplevel(axis=1, level=0).reset_index()

Here, I have dropped only one level but you can pass an array instead as well:

df.droplevel(axis=1, level=[0,1]).reset_index()
Shayan RC
  • 3,152
  • 5
  • 33
  • 40