96

I've seen a few variations on the theme of exploding a column/series into multiple columns of a Pandas dataframe, but I've been trying to do something and not really succeeding with the existing approaches.

Given a DataFrame like so:

    key       val
id
2   foo   oranges
2   bar   bananas
2   baz    apples
3   foo    grapes
3   bar     kiwis

I want to convert the items in the key series into columns, with the val values serving as the values, like so:

        foo        bar        baz
id
2   oranges    bananas     apples
3    grapes      kiwis        NaN

I feel like this is something that should be relatively straightforward, but I've been bashing my head against this for a few hours now with increasing levels of convolution, and no success.

Idan Gazit
  • 2,500
  • 3
  • 20
  • 24

2 Answers2

143

There are a few ways:

using .pivot_table:

>>> df.pivot_table(values='val', index=df.index, columns='key', aggfunc='first')
key      bar     baz      foo
id                           
2    bananas  apples  oranges
3      kiwis     NaN   grapes

using .pivot:

>>> df.pivot(index=df.index, columns='key')['val']
key      bar     baz      foo
id                           
2    bananas  apples  oranges
3      kiwis     NaN   grapes

using .groupby followed by .unstack:

>>> df.reset_index().groupby(['id', 'key'])['val'].aggregate('first').unstack()
key      bar     baz      foo
id                           
2    bananas  apples  oranges
3      kiwis     NaN   grapes
behzad.nouri
  • 74,723
  • 18
  • 126
  • 124
  • 1
    @behzad.nouri could you please also extend this where there is an additional column you want to keep around? like lets say there was an additional column C in the original? – Tommy Aug 27 '21 at 01:08
  • 1
    @Tommy add the columns you want to keep to the index. For example, df.pivot_table(values='val', index=['colA','colB'], columns='key', aggfunc='first') – Juha Palomäki Jan 16 '22 at 21:46
10

You could use set_index and unstack

In [1923]: df.set_index([df.index, 'key'])['val'].unstack()
Out[1923]:
key      bar     baz      foo
id
2    bananas  apples  oranges
3      kiwis    None   grapes

Or, a simplified groupby

In [1926]: df.groupby([df.index, 'key'])['val'].first().unstack()
Out[1926]:
key      bar     baz      foo
id
2    bananas  apples  oranges
3      kiwis    None   grapes
Zero
  • 74,117
  • 18
  • 147
  • 154