0

After pivoting a dataframe with two values like below:

import pandas as pd

df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
                       'foo', 'bar', 'foo', 'bar'],
            'B' : ['one', 'one', 'two', 'two',
                      'two', 'two', 'one', 'two'],
            'C' : [56, 2, 3, 4, 5, 6, 0, 2],
            'D' : [51, 2, 3, 4, 5, 6, 0, 2]})

pd.pivot_table(df, values=['C','D'],rows='B',cols='A').unstack().reset_index()

When I unstack the pivot and reset the index two new columns 'level_0' and 0 are created. Level_0 contains the column names C and D and 0 contains the values.

    level_0     A   B   0
0   C   bar     one     2.0
1   C   bar     two     4.0
2   C   foo     one     28.0
3   C   foo     two     4.0
4   D   bar     one     2.0
5   D   bar     two     4.0
6   D   foo     one     25.5
7   D   foo     two     4.0

Is it possible to unstack the frame so each value (C,D) appears in a separate column or do I have to split and concatenate the frame to achieve this? Thanks.

edited to show desired output:

    A   B   C   D
0   bar one 2   2
1   bar two 4   4
2   foo one 28  25.5
3   foo two 4   4
JAB
  • 12,401
  • 6
  • 45
  • 50

1 Answers1

4

You want to stack (and not unstack):

In [70]: pd.pivot_table(df, values=['C','D'],rows='B',cols='A').stack()
Out[70]: 
          C     D
B   A            
one bar   2   2.0
    foo  28  25.5
two bar   4   4.0
    foo   4   4.0

Although the unstack you used did a 'stack' operation because you had no MultiIndex in the index axis (only in the column axis).

But actually, you can get there also (and I think more logical) with a groupby-operation, as this is what you actually do (group columns C and D by A and B):

In [72]: df.groupby(['A', 'B']).mean()
Out[72]: 
          C     D
A   B            
bar one   2   2.0
    two   4   4.0
foo one  28  25.5
    two   4   4.0
joris
  • 133,120
  • 36
  • 247
  • 202
  • Thanks Joris. Groupby works in this example, but in the real application I have non-unique dates so I am using pivot to put the dimensions in the horizontal axis and to group the dates so each date in the index is unique so they can be resampled. Stack I think will do the trick. – JAB Mar 26 '14 at 16:12
  • You can always give the `level` keyword to stack to specify which level of the columns should go to the index (default it is the last one). BTW, I don't know the exact application, but you can also group on dates/datetimes in groupby to create unique dates. – joris Mar 26 '14 at 16:15
  • pd.pivot_table(df, values=['C','D'],rows=['B', 'A']) should give the same result? – user1827356 Mar 26 '14 at 16:24
  • @user1827356 Indeed! For `pivot_table`, that's the way to go, although I find the groupby personally more intuitive. – joris Mar 26 '14 at 21:46