I have a dataframe which looks like this:
   START_DATE    END_DATE  START_HOUR  END_HOUR  CURTAILED_OTC_MW
0  04-11-2015  31-10-2016         1.0      18.0             700.0
1  04-11-2015  31-10-2016         1.0      18.0             700.0
2  30-01-2016  30-01-2016         5.0      14.0             708.0
3  30-01-2016  30-01-2016        15.0      18.0             658.0
4  04-11-2015  31-03-2016         1.0      24.0            1368.0
5  04-11-2015  31-03-2016         1.0      24.0            1368.0
I am trying to group the dataframe on four columns (START_DATE, END_DATE, START_HOUR, END_HOUR) and want the output to look something like this:
   START_DATE    END_DATE  START_HOUR  END_HOUR  CURTAILED_OTC_MW
0  04-11-2015  31-10-2016         1.0      18.0            1400.0
1  30-01-2016  30-01-2016         5.0      14.0            1364.0
2  04-11-2015  31-03-2016         1.0      24.0            2736.0
When I try to use groupby, it gives me the following output:
g = data.groupby(['START_DATE','END_DATE','START_HOUR','END_HOUR']).sum()
                                          CURTAILED_OTC_MW
START_DATE END_DATE   START_HOUR END_HOUR
01-01-2002 31-01-2020 1.0        24.0             200187.0
01-01-2011 31-01-2020 1.0        19.0              91000.0
                                 24.0             182000.0
01-01-2016 01-01-2016 6.0        14.0               9937.0
                                 16.0              19874.0
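For reference, the grouped result can be kept flat (ordinary columns instead of a MultiIndex) by passing as_index=False. This is only a minimal sketch: it assumes the six sample rows from the top of the post rather than my full dataset, and the names keys and flat are made up for illustration.

```python
import pandas as pd

# Assumption: only the six sample rows shown at the top of the post.
data = pd.DataFrame({
    "START_DATE": ["04-11-2015", "04-11-2015", "30-01-2016",
                   "30-01-2016", "04-11-2015", "04-11-2015"],
    "END_DATE": ["31-10-2016", "31-10-2016", "30-01-2016",
                 "30-01-2016", "31-03-2016", "31-03-2016"],
    "START_HOUR": [1.0, 1.0, 5.0, 15.0, 1.0, 1.0],
    "END_HOUR": [18.0, 18.0, 14.0, 18.0, 24.0, 24.0],
    "CURTAILED_OTC_MW": [700.0, 700.0, 708.0, 658.0, 1368.0, 1368.0],
})

keys = ["START_DATE", "END_DATE", "START_HOUR", "END_HOUR"]

# as_index=False keeps the grouping keys as regular columns;
# sort=False keeps groups in order of first appearance.
flat = data.groupby(keys, as_index=False, sort=False)["CURTAILED_OTC_MW"].sum()
print(flat)
```

With all four columns as keys, the two 30-01-2016 rows stay in separate groups because their hours differ, so this gives four rows for the sample data, not three.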
How do I create a time series from this covering every hour of every day in 2016, with each timestamp followed by the total MW value for that hour?
For example, for 01/01/2016:
Date
01-01-2016 00:00 382187.0
01-01-2016 01:00 382187.0
01-01-2016 02:00 382187.0
01-01-2016 03:00 382187.0
01-01-2016 04:00 382187.0
01-01-2016 05:00 402061.0
01-01-2016 06:00 402061.0
01-01-2016 07:00 402061.0
01-01-2016 08:00 402061.0
01-01-2016 09:00 402061.0
01-01-2016 10:00 402061.0
01-01-2016 11:00 402061.0
01-01-2016 12:00 402061.0
01-01-2016 13:00 402061.0
01-01-2016 14:00 402061.0
01-01-2016 15:00 402061.0
01-01-2016 16:00 402061.0
01-01-2016 17:00 382187.0
01-01-2016 18:00 382187.0
01-01-2016 19:00 382187.0
01-01-2016 20:00 382187.0
01-01-2016 21:00 382187.0
01-01-2016 22:00 382187.0
01-01-2016 23:00 382187.0
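To make the question concrete, here is a rough sketch of the kind of expansion I have in mind. It rests on assumptions that may well be wrong: each row applies on every day from START_DATE to END_DATE inclusive, the hours are hour-ending labels 1 to 24 (so hour 1 maps to 00:00 and hour 24 to 23:00), and both START_HOUR and END_HOUR are inclusive. It uses only the six sample rows, and names such as hour_label and series are made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: only the six sample rows shown at the top of the post.
data = pd.DataFrame({
    "START_DATE": ["04-11-2015", "04-11-2015", "30-01-2016",
                   "30-01-2016", "04-11-2015", "04-11-2015"],
    "END_DATE": ["31-10-2016", "31-10-2016", "30-01-2016",
                 "30-01-2016", "31-03-2016", "31-03-2016"],
    "START_HOUR": [1.0, 1.0, 5.0, 15.0, 1.0, 1.0],
    "END_HOUR": [18.0, 18.0, 14.0, 18.0, 24.0, 24.0],
    "CURTAILED_OTC_MW": [700.0, 700.0, 708.0, 658.0, 1368.0, 1368.0],
})

# Dates are day-first strings, so parse them with an explicit format.
for col in ("START_DATE", "END_DATE"):
    data[col] = pd.to_datetime(data[col], format="%d-%m-%Y")

# One timestamp per hour of 2016 (a leap year, so 366 * 24 entries).
hours = pd.date_range("2016-01-01 00:00", "2016-12-31 23:00",
                      freq=pd.Timedelta(hours=1))
day = hours.normalize()                    # calendar day of each timestamp
hour_label = np.asarray(hours.hour) + 1    # assumed hour-ending label, 1..24

# Add each row's MW to every hour that falls inside its date and hour window.
values = np.zeros(len(hours))
for row in data.itertuples(index=False):
    active = (
        (day >= row.START_DATE) & (day <= row.END_DATE)
        & (hour_label >= row.START_HOUR) & (hour_label <= row.END_HOUR)
    )
    values[active] += row.CURTAILED_OTC_MW

series = pd.Series(values, index=hours, name="CURTAILED_OTC_MW")
print(series.loc["2016-01-30"])
```

Looping over the rows is simple but does one pass over the 2016 index per row, so I am not sure it is the right approach for a large dataframe.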