Pandas long to wide form to resample

Question

I'm trying to work out how to resample the following data. The trick is that there are two variables which I need (identifier and variable) to maintain during that resample.

I've tried re-sampling using the following code:

df.set_index('date').to_period('d').resample('d').fillna(method='ffill')

But it doesnt allow me to given the index is not unique. So, I thought maybe if i take it to wide form and make it multi-level columns i could resample the re-shaped dataframe while maintaining the identifier and variable information.

original data format is as follows:

date, identifier, variable, value
2018-04-01, BBN, M3M, 24
2018-04-07, BBN, M3M, 25
2018-04-14, BBN, M3M, 25
2018-04-21, BBN, M3M, 25
2018-04-01, BBN, M12M, 48
2018-04-07, BBN, M12M, 52
2018-04-14, BBN, M12M, 50
2018-04-21, BBN, M12M, 49
2018-04-01, MXP, M3M, 18
2018-04-07, MXP, M3M, 20
2018-04-14, MXP, M3M, 25
2018-04-21, MXP, M3M, 22
2018-04-01, MXP, M12M, -10
2018-04-07, MXP, M12M, -12
2018-04-14, MXP, M12M, -14
2018-04-21, MXP, M12M, -18

Expected reshape:

      identifier  BBN,       MXP
      variable    M3M, M12M,M3M,M12M
date  2018-04-01, 24, 48, 18, -10
date  2018-04-07, 25, 52, 20, -12 
date  2018-04-14, 25, 50, 25, -14
date  2018-04-21, 25, 49, 22, -18

I think that once its in this format, the date index should be unique, then I'll be able to re-sample the data to include missing days. I'm just not sure how to pivot table to get the multiple level columns.

Can any body point me in the right direction, or provide any help or suggestions???

This is a commonly asked duplicate – cs95 May 02 '18 at 03:29 — cs95, May 02 '18 at 03:29

score 0 · Answer 1 · answered May 02 '18 at 03:25

Fixing your code by using set_index and unstack

df.set_index(['date','identifier','variable'])['value'].unstack([1,2])
Out[558]: 
 identifier  BBN        MXP      
 variable    M3M  M12M  M3M  M12M
date                             
2018-04-01    24    48   18   -10
2018-04-07    25    52   20   -12
2018-04-14    25    50   25   -14
2018-04-21    25    49   22   -18

Or pivot_table

df.pivot_table(index='date',columns=['identifier','variable'],values='value')
Out[560]: 
identifier   BBN        MXP     
variable    M12M  M3M  M12M  M3M
date                            
2018-04-01    48   24   -10   18
2018-04-07    52   25   -12   20
2018-04-14    50   25   -14   25
2018-04-21    49   25   -18   22

thank you so much! always such a simple solution.. – firstworldchild May 02 '18 at 03:32 — firstworldchild, May 02 '18 at 03:32

Pandas long to wide form to resample

1 Answers1