I have a DataFrame generated with the following code:

import pandas as pd
import numpy as np

df = pd.DataFrame({'dataset': ['dataset1']*2 + ['dataset2']*2 + ['dataset3']*2,
                   'frame': [1,2] * 3,
                   'result1': np.random.randn(6),
                   'result2': np.random.randn(6),
                   'result3': np.random.randn(6),
                   'method': ['A']*3 + ['B']*3
                  })
df = df.set_index(['dataset','frame'])
df

How can I transform it so that it has multi-indexed columns, with the values of the 'method' column as level 0 of the column MultiIndex? Missing values should be filled in.

The final goal is to easily compare corresponding values in the columns 'result1', 'result2', and 'result3' between methods 'A' and 'B'.

packoman

1 Answer

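You can append 'method' to the existing MultiIndex with DataFrame.set_index (using append=True), move it into the columns with DataFrame.unstack, and finally make it level 0 of the columns with DataFrame.swaplevel followed by DataFrame.sort_index: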

df = df.set_index('method', append=True).unstack().swaplevel(1,0, axis=1).sort_index(axis=1)
print(df)
method                 A                             B                    
                 result1   result2   result3   result1   result2   result3
dataset  frame                                                            
dataset1 1      1.488609  1.130858  0.409016       NaN       NaN       NaN
         2      0.676011  0.645002  0.102751       NaN       NaN       NaN
dataset2 1     -0.418451  0.106414 -1.907722       NaN       NaN       NaN
         2           NaN       NaN       NaN -0.806521  0.422155  1.100224
dataset3 1           NaN       NaN       NaN  0.555876  0.124207 -1.402325
         2           NaN       NaN       NaN -0.705504 -0.837953 -0.225081

# if you need to remove the second index level ('frame')
df = df.reset_index(level=1, drop=True)
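
To make the chained call easier to follow, here is a sketch that runs the same four steps one at a time on a frame built like the one in the question. The seeded generator and the names `step1`, `step2`, `step3`, `wide`, `diff` and `result1_by_method` are only illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same layout as the question's frame, with a seeded generator so runs repeat.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'dataset': ['dataset1'] * 2 + ['dataset2'] * 2 + ['dataset3'] * 2,
    'frame': [1, 2] * 3,
    'result1': rng.standard_normal(6),
    'result2': rng.standard_normal(6),
    'result3': rng.standard_normal(6),
    'method': ['A'] * 3 + ['B'] * 3,
}).set_index(['dataset', 'frame'])

# Step 1: add 'method' as a third index level, keeping 'dataset' and 'frame'.
step1 = df.set_index('method', append=True)

# Step 2: move the innermost index level ('method') into the columns.
# The column levels are now (result, method); absent combinations become NaN.
step2 = step1.unstack()

# Step 3: swap the two column levels so that 'method' is level 0.
step3 = step2.swaplevel(1, 0, axis=1)

# Step 4: sort the columns so each method's results are grouped together.
wide = step3.sort_index(axis=1)

# Selecting on level 0 gives one sub-frame per method with matching columns.
# For this sample every entry of the difference is NaN, because no
# (dataset, frame) pair has values for both methods.
diff = wide['A'] - wide['B']

# A cross-section on level 1 puts A and B side by side for a single result.
result1_by_method = wide.xs('result1', axis=1, level=1)

print(wide)
print(result1_by_method)
```

With data in which the same (dataset, frame) pair was processed by both methods, `diff` would hold the per-result differences between 'A' and 'B' instead of NaN.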
jezrael
    I just realized that for my actual usage, `append=True` is critical in `set_index('method', append=True)`; without it I get the error `ValueError: Index contains duplicate entries, cannot reshape`. This is also described [here](https://stackoverflow.com/a/34855662/653770). – packoman Jun 17 '21 at 09:15
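
The frame from the commenter's actual usage is not shown, so the following is only an assumed reproduction of how that error can arise: if an index level that distinguishes rows (here 'frame') is lost before `unstack`, the remaining index entries are no longer unique and pandas cannot decide which value goes into which cell. The names `ambiguous`, `message`, `unique` and `wide` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'dataset': ['dataset1'] * 2 + ['dataset2'] * 2 + ['dataset3'] * 2,
    'frame': [1, 2] * 3,
    'result1': range(6),
    'method': ['A'] * 3 + ['B'] * 3,
})

# Assumed failing variant: 'frame' is not part of the index, so
# ('dataset1', 'A') and ('dataset3', 'B') each occur twice.
ambiguous = df.drop(columns='frame').set_index(['dataset', 'method'])

try:
    ambiguous.unstack('method')
    message = None
except ValueError as err:
    # Duplicate index entries make the reshape ambiguous.
    message = str(err)

print(message)

# Keeping 'frame' in the index makes every (dataset, frame, method)
# combination unique, so the reshape succeeds.
unique = df.set_index(['dataset', 'frame', 'method'])
wide = unique.unstack('method')
print(wide)
```

The general point is that `unstack` needs each combination of the remaining index levels and the unstacked level to identify at most one row.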