I am trying to add missing dates to my dataframe.
I have seen these posts: reindex and reindex2.
When I try to reindex my dataframe:
print(df)
df = df.reindex(dates, fill_value=0)
print(df)
I get the following output:
_updated_at       Name       hour  day  date        time     data1     data2
06/06/2016 13:27  game_name  13    6    06/06/2016  evening  0         0
07/06/2016 10:33  game_name  10    7    07/06/2016  morning  145.2788  122.7361
18/10/2016 14:34  game_name  14    18   18/10/2016  evening  0         0
19/10/2016 17:12  game_name  17    19   19/10/2016  evening  0         0
24/10/2016 11:05  game_name  11    24   24/10/2016  morning  313.5954  364.4107
24/10/2016 12:02  game_name  12    24   24/10/2016  evening  0         0
25/10/2016 08:50  game_name  8     25   25/10/2016  morning  362.4682  431.5803
25/10/2016 13:00  game_name  13    25   25/10/2016  evening  0         0
_updated_at  Name  hour  day  date  time  data1  data2
24/10/2016   0     0     0    0     0     0      0
25/10/2016   0     0     0    0     0     0      0
26/10/2016   0     0     0    0     0     0      0
27/10/2016   0     0     0    0     0     0      0
28/10/2016   0     0     0    0     0     0      0
29/10/2016   0     0     0    0     0     0      0
30/10/2016   0     0     0    0     0     0      0
I expected new rows to be added only for the missing dates, with 0 in each column, rather than every existing row being replaced with 0.
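A minimal sketch of one way this symptom can arise, assuming the index holds full timestamps while `dates` holds plain days. `reindex` keeps a row only when its label matches a new label exactly, so nothing survives when none match. The toy frame and the date range are invented for illustration and are not the real data.

```python
import pandas as pd

# Hypothetical toy frame: the index carries a time of day, as in the output above.
df = pd.DataFrame(
    {"data1": [0.0, 145.2788, 313.5954]},
    index=pd.to_datetime(
        ["06/06/2016 13:27", "07/06/2016 10:33", "24/10/2016 11:05"],
        dayfirst=True,
    ),
)

# Assumed target index: one label per day, at midnight.
dates = pd.date_range("2016-06-06", "2016-06-08", freq="D")

# No label in `dates` equals a timestamp that includes a time,
# so every row comes back filled with 0.
mismatched = df.reindex(dates, fill_value=0)

# Normalizing the index to midnight makes the labels comparable.
daily = df.copy()
daily.index = daily.index.normalize()
matched = daily.reindex(dates, fill_value=0)

print(mismatched)
print(matched)
```

This only works while each day appears once in the index. With a morning and an evening row per day, the normalized index would contain duplicates, which `reindex` rejects.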
EDIT: The overall goal is to calculate the difference between the morning and evening values for each day.
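A rough sketch of how that goal might be approached, assuming the frame can be indexed by (date, time) and that treating a missing reading as 0 is acceptable. The sample values and the date range are invented. The name `mux` mirrors the one used in the snippets further down.

```python
import pandas as pd

# Hypothetical sample: some days lack a morning or an evening row, one day is absent.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2016-10-24", "2016-10-24", "2016-10-25", "2016-10-27"]
        ),
        "time": ["morning", "evening", "morning", "morning"],
        "data1": [313.5954, 0.0, 362.4682, 100.0],
    }
).set_index(["date", "time"])

# Every (day, part-of-day) combination that should exist.
all_days = pd.date_range("2016-10-24", "2016-10-27", freq="D")
mux = pd.MultiIndex.from_product(
    [all_days, ["morning", "evening"]], names=["date", "time"]
)

# Existing rows keep their values; missing combinations are filled with 0.
full = df.reindex(mux, fill_value=0)

# One column per part of day, then a per-day difference.
wide = full["data1"].unstack("time")
wide["diff"] = wide["morning"] - wide["evening"]
print(wide)
```

Filling with 0 makes a missing reading indistinguishable from a real zero. Leaving the gaps as NaN (by dropping `fill_value`) may be preferable if that distinction matters.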
EDIT2: Current output:
print (df.reindex(mux, fill_value=0).groupby(level=0)['data1'].diff(-1).dropna())
dtypes: float64(2)None
2016-06-06 morning 0.00000
2016-06-07 morning 440.99582
2016-06-08 morning 0.00000
2016-06-09 morning 0.00000
2016-06-10 morning 0.00000
print (df.reindex(mux, fill_value=0).groupby(level=0)['data2'].diff(-1).dropna())
Length: 142, dtype: float64
2016-06-06 morning -220.5481
2016-06-07 morning 0.0000
2016-06-08 morning 0.0000
2016-06-09 morning 0.0000
2016-06-10 morning 0.0000
2016-06-11 morning 0.0000
I was expecting to see evening values.
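A small sketch that may explain the morning-only result, assuming each day's group holds a morning row followed by an evening row. `diff(-1)` subtracts the next row in the group from the current one, so the last row of each group has nothing to subtract and becomes NaN, and `dropna()` then removes it. The numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical two-day series indexed by (date, time), morning first.
mux = pd.MultiIndex.from_product(
    [pd.to_datetime(["2016-06-06", "2016-06-07"]), ["morning", "evening"]],
    names=["date", "time"],
)
s = pd.Series([0.0, 220.5481, 145.2788, 0.0], index=mux, name="data1")

# Within each day: current row minus the following row.
d = s.groupby(level=0).diff(-1)
print(d)    # the evening rows are NaN

# dropna() keeps only the rows that had a following row, i.e. the morning ones.
kept = d.dropna()
print(kept)
```

So the value shown against each morning label is already the morning-minus-evening difference for that day, not a morning reading.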