11

Imagine a Pandas dataframe of the following format:

id  type  v1  v2
1   A     6   9
1   B     4   2
2   A     3   7
2   B     3   6

I would like to convert this dataframe into the following format:

id  A_v1  A_v2  B_v1  B_v2
1   6     9     4     2
2   3     7     3     6

Is there an elegant way of doing this?

Logister
  • 1,852
  • 23
  • 26

1 Answers1

14

You could use set_index to move the type and id columns into the index, and then unstack to move the type index level into the column index. You don't have to worry about the v values -- where the indexes go dictate the arrangement of the values.

The result is a DataFrame with a MultiIndex for the column index:

In [181]: df.set_index(['type', 'id']).unstack(['type'])
Out[181]: 
     v1    v2   
type  A  B  A  B
id              
1     6  4  9  2
2     3  3  7  6

Generally, a MultiIndex is preferable to a flattened column index. It gives you better ways to select or manipulate your data based on type or v value.

If you wish to reorder the columns to exactly match the order shown in the desired output, you could use df.reindex:

df = df.reindex(columns=sorted(df.columns, key=lambda x: x[::-1]))

yields

     v1 v2 v1 v2
type  A  A  B  B
id              
1     6  9  4  2
2     3  7  3  6

And if you wish to flatten the column index to a single level, then

df.columns = ['{}_{}'.format(t, v) for v,t in df.columns]

yields

    A_v1  A_v2  B_v1  B_v2
id                        
1      6     9     4     2
2      3     7     3     6
unutbu
  • 842,883
  • 184
  • 1,785
  • 1,677