1

I'd like to reshape a pandas dataframe from wide to long. The challenge lies in the fact that the columns have got multiindexed column headers. The dataframe looks like this:

category   price1           price2          
year       2011 2012 2013   2011 2012 2013
1            33   22   48    135  144  149
2            22   26   37    136  127  129
3            39   30   47    123  148  148
4            45   42   21    140  126  121
5            20   37   35    141  142  147
6            29   20   34    122  121  132
7            20   35   45    128  123  130
8            39   34   49    125  120  131
9            24   20   36    122  146  130
10           24   37   43    142  133  138
11           23   22   40    124  135  131
12           27   22   40    147  149  132

Below is a snippet that produces the very same dataframe. You will also see that I've built this dataframe by concatenating two other dataframes.

Here's the snippet:

import pandas as pd
import numpy as np

# Make dataframe df1 with 12 observations over 3 years
# with multiindexed column headers
np.random.seed(123)
df1 = pd.DataFrame(np.random.randint(20, 50, size = (12,3)), columns=[2011,2012,2013])
df1.index = np.arange(1,len(df1)+1)
colNames1 = df1.columns
header1 = pd.MultiIndex.from_product([['price1'], colNames1], names=['category','year'])
df1.columns = header1

# Make dataframe df2 with 12 observations over 3 years
# with multiindexed column headers
df2 = pd.DataFrame(np.random.randint(120, 150, size = (12,3)), columns=[2011,2012,2013])
df2.index = np.arange(1,len(df2)+1)
colNames1 = df2.columns
header1 = pd.MultiIndex.from_product([['price2'], colNames1], names=['category','year'])
df2.columns = header1

df3 = pd.concat([df1, df2], axis = 1)

And here is the desired output:

        price1  price2
1   2011    33  135
2   2011    22  136
3   2011    39  123
4   2011    45  140
5   2011    20  141
6   2011    29  122
7   2011    20  128
8   2011    39  125
9   2011    24  122
10  2011    24  142
11  2011    23  124
12  2011    27  147
1   2012    22  144
2   2012    26  127
3   2012    30  148
4   2012    42  126
5   2012    37  142
6   2012    20  121
7   2012    35  123
8   2012    34  120
9   2012    20  146
10  2012    37  133
11  2012    22  135
12  2012    22  149
1   2013    48  149
2   2013    37  129
3   2013    47  148
4   2013    21  121
5   2013    35  147
6   2013    34  132
7   2013    45  130
8   2013    49  131
9   2013    36  130
10  2013    43  138
11  2013    40  131
12  2013    40  132

I've tried different solutions based on suggestions with Reshape and pandas.wide_to_long, but I'm struggling with the multiindexed column names. So why not just remove this? Mostly because this is what my real world problem will look like, and also because I refuse to believe that it can't be done.

Thank you for any suggestions!

vestland
  • 55,229
  • 37
  • 187
  • 305

1 Answers1

2

Use stack be last level and sort_index, add rename_axis and reset_index for columns:

df3 = (df3.stack()
         .sort_index(level=[1,0])
        .rename_axis(['months','year'])
        .reset_index()
        .rename_axis(None, 1))
print (df3.head(15))
    months  year  price1  price2
0        1  2011      33     135
1        2  2011      22     136
2        3  2011      39     123
3        4  2011      45     140
4        5  2011      20     141
5        6  2011      29     122
6        7  2011      20     128
7        8  2011      39     125
8        9  2011      24     122
9       10  2011      24     142
10      11  2011      23     124
11      12  2011      27     147
12       1  2012      22     144
13       2  2012      26     127
14       3  2012      30     148

If need MutliIndex:

df3 = df3.stack().sort_index(level=[1,0])
print (df3.head(15))
category  price1  price2
   year                 
1  2011       33     135
2  2011       22     136
3  2011       39     123
4  2011       45     140
5  2011       20     141
6  2011       29     122
7  2011       20     128
8  2011       39     125
9  2011       24     122
10 2011       24     142
11 2011       23     124
12 2011       27     147
1  2012       22     144
2  2012       26     127
3  2012       30     148
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252