I'm trying to work out how to resample the following data. The trick is that there are two variables which I need (identifier and variable) to maintain during that resample.
I've tried re-sampling using the following code:
df.set_index('date').to_period('d').resample('d').fillna(method='ffill')
But it doesnt allow me to given the index is not unique. So, I thought maybe if i take it to wide form and make it multi-level columns i could resample the re-shaped dataframe while maintaining the identifier and variable information.
original data format is as follows:
date, identifier, variable, value
2018-04-01, BBN, M3M, 24
2018-04-07, BBN, M3M, 25
2018-04-14, BBN, M3M, 25
2018-04-21, BBN, M3M, 25
2018-04-01, BBN, M12M, 48
2018-04-07, BBN, M12M, 52
2018-04-14, BBN, M12M, 50
2018-04-21, BBN, M12M, 49
2018-04-01, MXP, M3M, 18
2018-04-07, MXP, M3M, 20
2018-04-14, MXP, M3M, 25
2018-04-21, MXP, M3M, 22
2018-04-01, MXP, M12M, -10
2018-04-07, MXP, M12M, -12
2018-04-14, MXP, M12M, -14
2018-04-21, MXP, M12M, -18
Expected reshape:
identifier BBN, MXP
variable M3M, M12M,M3M,M12M
date 2018-04-01, 24, 48, 18, -10
date 2018-04-07, 25, 52, 20, -12
date 2018-04-14, 25, 50, 25, -14
date 2018-04-21, 25, 49, 22, -18
I think that once its in this format, the date index should be unique, then I'll be able to re-sample the data to include missing days. I'm just not sure how to pivot table to get the multiple level columns.
Can any body point me in the right direction, or provide any help or suggestions???