1

I am new to python and dataframes. I have a dataframe with the following structure:

ID   |DATE       |COLUMN_1|COLUMN_2|COLUMN_3|

ID_1 |2017-04-01 |VALA    |VALB    |VALC    |

ID_1 |2016-12-31 |VALD    |VALE    |VALF    |

ID_1 |2016-09-24 |VALG    |VALH    |VALI    |

ID_2 |2008-06-30 |VALJ    |VALK    |VALL    |

ID_2 |2008-03-31 |VALM    |VALN    |VALO    |

ID_2 |2007-12-31 |VALP    |VALQ    |VALR    |

ID_2 |2007-09-30 |VALS    |VALT    |VALU    |

ID_3 |2017-04-01 |VALV    |VALW    |VALY    |

ID_3 |2016-12-31 |VALZ    |VALZ1   |VALZ2   |

I need to reshape it, so that it gets grouped by the Date column sorted descending, and all combinations of ID + existing columns are expanded to new columns. The header and the data should look something like this:

DATE_GROUP|ID_1_COLUMN_1|ID_1_COLUMN_2|ID_1_COLUMN_3| ID_2_COLUMN_1|ID_2_COLUMN_2|ID_2_COLUMN_3|ID_3_COLUMN_1|ID_3_COLUMN_2|ID_3_COLUMN_3|

2017-04-01|VALA|VALB|VALC|NONE|NONE|NONE|VALV|VALW|VALY|

2016-12-31|VALD|VALE|VALF|NONE|NONE|NONE|VALZ|VALZ1|VALZ2|

2016-09-24|VALG|VALH|VALI|NONE|NONE|NONE|NONE|NONE|NONE|

2008-06-30|NONE|NONE|NONE|VALJ|VALK|VALL|NONE|NONE|NONE

2008-03-31|NONE|NONE|NONE|VALM|VALN|VALO|NONE|NONE|NONE

2007-12-31|NONE|NONE|NONE|VALP|VALQ|VALO|NONE|NONE|NONE

2007-09-30|NONE|NONE|NONE|VALS|VALT|VALU|NONE|NONE|NONE

Had a look at here which got me started but couldn't quite get to the same output structure.

cristi.calugaru
  • 571
  • 10
  • 22

1 Answers1

0

Use:

  • first reshape by set_index + unstack
  • double sort_index - for index and for second level of MultiIndex in columns
  • flaten Multiindex in columns by list comprehension

df = (df.set_index(['DATE', 'ID'])
       .unstack()
       .sort_index(ascending=False)
       .sort_index(axis=1, level=1))
df.columns = ['{}_{}'.format(b,a) for a, b in df.columns]

print (df)

           ID_1_COLUMN_1 ID_1_COLUMN_2 ID_1_COLUMN_3 ID_2_COLUMN_1  \
DATE                                                                 
2017-04-01          VALA          VALB          VALC          None   
2016-12-31          VALD          VALE          VALF          None   
2016-09-24          VALG          VALH          VALI          None   
2008-06-30          None          None          None          VALJ   
2008-03-31          None          None          None          VALM   
2007-12-31          None          None          None          VALP   
2007-09-30          None          None          None          VALS   

           ID_2_COLUMN_2 ID_2_COLUMN_3 ID_3_COLUMN_1 ID_3_COLUMN_2  \
DATE                                                                 
2017-04-01          None          None          VALV          VALW   
2016-12-31          None          None          VALZ         VALZ1   
2016-09-24          None          None          None          None   
2008-06-30          VALK          VALL          None          None   
2008-03-31          VALN          VALO          None          None   
2007-12-31          VALQ          VALR          None          None   
2007-09-30          VALT          VALU          None          None   

           ID_3_COLUMN_3  
DATE                      
2017-04-01          VALY  
2016-12-31         VALZ2  
2016-09-24          None  
2008-06-30          None  
2008-03-31          None  
2007-12-31          None  
2007-09-30          None  
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252