0

Reading a matrix data with stacked columns in pandas

Hello ,

Newbie to Python here. Need some help in reading xls file with following format as pandas dataframe and writing it as a record

xls layout looks like this

InputIMage

    |   SP500 | DJIA    
    |   Equity| Equity
    |   Large | BlueChip
-----------------------------           
1-Apr-23 |  3.51  | 2.56
1-Mar-23 |  -2.61 | 1.89
1-Feb-23 |  6.18  | -4.06
1-Jan-23 |  -5.9  | 2.83

Basically its a timeseries of performance return of 2 major indices.

I'd like to flatten it out something like this

OutputDataFrameImage

Date     | Index | AssetClass | Type     | Return
1-Apr-23 | SP500 | Equities   | Large    | 3.51
1-Mar-23 | SP500 | Equities   | Large    | -2.61
1-Feb-23 | SP500 | Equities   | Large    | 6.18
1-Jan-23 | SP500 | Equities   | Large    | -5.9
1-Apr-23 | DJIA  | Equities   | BlueChip | 2.56
1-Mar-23 | DJIA  | Equities   | BlueChip | 1.89
1-Feb-23 | DJIA  | Equities   | BlueChip | -4.06
1-Jan-23 | DJIA  | Equities   | BlueChip | 2.83

For a single column header, I used dataframe.melt() function to flatten it out. I am unsure how I could achieve it for stacked columns.

Appreciate your help

Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52

1 Answers1

1

It the same for multiindex columns

out = df.melt(ignore_index=False, var_name=['Index', 'AssetClass', 'Type'], value_name='Return')

NOTE: ensure you read the xlsx file with header=[0,1,2]) argument to get the multiindex columns

print(out)

          Index AssetClass      Type  Return
1-Apr-23  SP500     Equity     Large    3.51
1-Mar-23  SP500     Equity     Large   -2.61
1-Feb-23  SP500     Equity     Large    6.18
1-Jan-23  SP500     Equity     Large   -5.90
1-Apr-23   DJIA     Equity  BlueChip    2.56
1-Mar-23   DJIA     Equity  BlueChip    1.89
1-Feb-23   DJIA     Equity  BlueChip   -4.06
1-Jan-23   DJIA     Equity  BlueChip    2.83
Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52