48

Let's assume that I have the following dataframe in pandas:

         AA  BB  CC
 date
05/03     1   2   3
06/03     4   5   6
07/03     7   8   9
08/03     5   7   1

and I want to transform it to the following:

AA  05/03  1
AA  06/03  4
AA  07/03  7
AA  08/03  5
BB  05/03  2
BB  06/03  5
BB  07/03  8
BB  08/03  7
CC  05/03  3
CC  06/03  6
CC  07/03  9
CC  08/03  1

How can I do it?

The reason of the transformation from wide to long is that, in the next stage, I would like to merge this dataframe with another one, based on dates and the initial column names (AA, BB, CC).

cottontail
  • 10,268
  • 18
  • 50
  • 51
km1234
  • 2,127
  • 4
  • 18
  • 21
  • 1
    If you're actually looking for wide to long, not just to melt, see: [pandas.wide_to_long](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.wide_to_long.html) – BeRT2me Oct 28 '22 at 18:59

3 Answers3

62

Use pandas.melt or pandas.DataFrame.melt to transform from wide to long:

df = pd.DataFrame({
    'date' : ['05/03', '06/03', '07/03', '08/03'],
    'AA' : [1, 4, 7, 5],
    'BB' : [2, 5, 8, 7],
    'CC' : [3, 6, 9, 1]
}).set_index('date')
df

        AA  BB  CC
date            
05/03   1   2   3
06/03   4   5   6
07/03   7   8   9
08/03   5   7   1

To convert, we just need to reset the index and then melt:

df = df.reset_index()
pd.melt(df, id_vars='date', value_vars=['AA', 'BB', 'CC'])

Using .reset_index after .melt, removes the need to specify value_vars.

dfm = df.melt(ignore_index=False).reset_index()

Final Result - both options

     date variable  value
0   05/03       AA      1
1   06/03       AA      4
2   07/03       AA      7
3   08/03       AA      5
4   05/03       BB      2
5   06/03       BB      5
6   07/03       BB      8
7   08/03       BB      7
8   05/03       CC      3
9   06/03       CC      6
10  07/03       CC      9
11  08/03       CC      1
Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
George Liu
  • 3,601
  • 10
  • 43
  • 69
40

Update

As George Liu has shown in another answer, pd.melt is the idiomatic, flexible and fast solution to this problem. Do not use unstack for this.


unstack returns a series with a multiindex:

    In [38]: df.unstack()
    Out[38]: 
        date 
    AA  05/03    1
        06/03    4
        07/03    7
        08/03    5
    BB  05/03    2
        06/03    5
        07/03    8
        08/03    7
    CC  05/03    3
        06/03    6
        07/03    9
        08/03    1
    dtype: int64

You can call reset_index on the returning series:

In [39]: df.unstack().reset_index() 
Out[39]:        
        
    level_0 date    0
0   AA      05-03   1
1   AA      06-03   4
2   AA      07-03   7
3   AA      08-03   5
4   BB      05-03   2
5   BB      06-03   5
6   BB      07-03   8
7   BB      08-03   7
8   CC      05-03   3
9   CC      06-03   6
10  CC      07-03   9
11  CC      08-03   1

Or construct a dataframe with a multiindex:

In [40]: pd.DataFrame(df.unstack())     
Out[40]:        
        
            0
    date    
AA  05-03   1
    06-03   4
    07-03   7
    08-03   5
BB  05-03   2
    06-03   5
    07-03   8
    08-03   7
CC  05-03   3
    06-03   6
    07-03   9
    08-03   1
ayhan
  • 70,170
  • 20
  • 182
  • 203
0

Apart from unstack and melt, stack can also be used here.

df1 = df.stack().reset_index(name='value')

# change "weird" column label
df1 = df.stack().reset_index(name='value').rename(columns={'level_1': 'variable'})

All of melt, stack and unstack are very fast methods, so runtime differences will hardly matter in normal circumstances. If runtime is such a concern, a numpy based solution could also be used (that is about 50% faster than melt). The idea is to simply flatten the values in the frame into a 1D array and repeat the index and column labels along with it.

df1 = pd.DataFrame({ 'variable': np.tile(df.columns, len(df)), 'date': df.index.repeat(df.shape[1]), 'value': df.values.ravel()})

res


If column labels are not needed as a separate column, then another really fast function is pd.lreshape.

df1 = pd.lreshape(df.reset_index(), {'value': ['AA', 'BB', 'CC']})

res2

cottontail
  • 10,268
  • 18
  • 50
  • 51
  • What if instead of "date", I had multiple columns that I wanted to be index or reference? – PM0087 Aug 09 '23 at 09:22
  • @PM0087 are you asking “how to make wide-to-long conversion for MultiIndex dataframes”? Or are you asking about a long-to-wide conversion or something else altogether? – cottontail Aug 09 '23 at 14:06