1

I would like to rearrange cross-section into panel data.

Unfortunately I was not able to rearrange it with the pd.melt or pd.pivot formula. I also searched in the stack overflow community for a solution but did not find an answer which matches the problem.

The current data have the following form:

ID  FA_17   FA_18   TA_17   TA_18   OR_17   OR_18
-----------------------------------------------------
ID1  590     304     313     682     671     499
ID2  832     344     482     223     580     789
ID3  598     138     125     712     775     761

In the end they should be transformed into the following order:

     FA      TA      OR
ID1 FA_17   TA_17    OR_17
ID1 FA_18   TA_18    OR_18
ID2 FA_17   TA_17    OR_17
ID2 FA_18   TA_18    OR_18
ID3 FA_17   TA_17    OR_17
ID3 FA_18   TA_18    OR_18

Thank you very much in advance!

bharatk
  • 4,202
  • 5
  • 16
  • 30
Alexander Hempfing
  • 247
  • 1
  • 2
  • 9

1 Answers1

2

Create index by all columns with no _ by DataFrame.set_index, then create MultiIndex in columns by Series.str.split and reshape by DataFrame.stack, last remove second level of MultiIndex in index by DataFrame.reset_index with drop=True:

df = df.set_index('ID')
df.columns = df.columns.str.split('_', expand=True)
df = df.stack().reset_index(level=1, drop=True)

print (df)
      FA   OR   TA
ID                
ID1  590  671  313
ID1  304  499  682
ID2  832  580  482
ID2  344  789  223
ID3  598  775  125
ID3  138  761  712

Or use wide_to_long:

df1 = (pd.wide_to_long(df, stubnames=['FA','TA','OR'], j='p', i='ID', sep='_')
         .reset_index(level=1, drop=True)
         .sort_index())
print (df1)
      FA   TA   OR
ID                
ID1  590  313  671
ID1  304  682  499
ID2  832  482  580
ID2  344  223  789
ID3  598  125  775
ID3  138  712  761
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252