8

I have a multiindex pandas dataframe that looks like this

 ID            I                   II                  III
 METRIC        a    b    c    d    a    b    c    d    a    b    c    d
 2015-08-01    0    1    2    3    20   21   22   23   40   41   42   43
 2015-08-02    4    5    6    7    24   25   26   27   44   45   46   47
 2015-08-03    8    9    10   11   28   29   30   31   48   49   50   51

where it is indexed by the dates (2015-08-01, 2015-08-02, 2015-08-03, etc.), the first-level columns (I, II, III) are IDs and the second-level columns are corresponding METRICs (a, b, c, d). I would like to reshape it to the following

METRIC               a    b    c    d
ID
I      2015-08-01    0    1    2    3
       2015-08-02    4    5    6    7
       2015-08-03    8    9    10   11
II     2015-08-01    20   21   22   23
       2015-08-02    24   25   26   27
       2015-08-03    28   29   30   31
III    2015-08-01    40   41   42   43
       2015-08-02    44   45   46   47
       2015-08-03    48   49   50   51

I have (unsuccessfully) looked into using .pivot, .stack, and .melt, but they don't give me what I am looking for. I currently loop over IDs and build a list of dataframes and concat them together as a new dataframe to get what I want.

Any suggestions would be greatly appreciated.

darXider
  • 447
  • 5
  • 16

3 Answers3

9

Let's use stack, swaplevel and sort_index:

df.stack(0).swaplevel(0,1).sort_index()

Output:

METRIC           a   b   c   d
ID                            
I   2015-08-01   0   1   2   3
    2015-08-02   4   5   6   7
    2015-08-03   8   9  10  11
II  2015-08-01  20  21  22  23
    2015-08-02  24  25  26  27
    2015-08-03  28  29  30  31
III 2015-08-01  40  41  42  43
    2015-08-02  44  45  46  47
    2015-08-03  48  49  50  51
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
  • 1
    Simple and Clear!! – BENY Jul 10 '17 at 15:28
  • @Wen yes it is :-) – piRSquared Jul 10 '17 at 15:41
  • Thank you. It took me a while to understand the process, but now it makes sense. Nice solution :) Quick question: in my original dataframe, some `ID`s (say, `V`) have rows (say, at `2015-08-02`) where all `METRIC`s are `NaN`s; these get removed by this solution. Why is that? – darXider Jul 10 '17 at 15:52
4

You can let transpose or T do some of the work for you.

df.T.stack().unstack(1)

METRIC           a   b   c   d
ID                            
I   2015-08-01   0   1   2   3
    2015-08-02   4   5   6   7
    2015-08-03   8   9  10  11
II  2015-08-01  20  21  22  23
    2015-08-02  24  25  26  27
    2015-08-03  28  29  30  31
III 2015-08-01  40  41  42  43
    2015-08-02  44  45  46  47
    2015-08-03  48  49  50  51
piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • Oooh.. I like using T here! Nice, Mr. piRSquared! +1 – Scott Boston Jul 10 '17 at 15:42
  • Thank you. I tried transposing at first, but I'm still not comfortable with using `stack` and `unstack`, so I couldn't figure out how to reorder things. This solution is very concise and nice. Quick question: in my original dataframe, some `ID`s (say, `V`) have rows (say, at `2015-08-02`) where all `METRIC`s are `NaN`s; these get removed by this solution. Why is that? – darXider Jul 10 '17 at 15:53
  • 1
    @darXider `stack` takes levels from the columns object into the index object. When it does, it drops nulls. You can avoid that by `stack(dropna=False)` – piRSquared Jul 10 '17 at 16:02
0

Using @piRSquared's method, we can skip the transpose, just df.unstack().unstack(1)

Frank
  • 505
  • 5
  • 14