I have a dataframe as follows:
import numpy as np
import pandas as pd
import datetime as dt
dates = ['2000-01-01', '2000-01-03', '2000-01-03', '2000-01-04']
dates_list = [dt.datetime.strptime(date, '%Y-%m-%d').date() for date in dates]
df = pd.DataFrame(np.random.rand(4, 2),
                  index=[['a', 'a', 'b', 'b'], dates_list],
                  columns=['data1', 'data2'])
df
data1 data2
a 2000-01-01 0.928427 0.029868
2000-01-03 0.243949 0.140281
b 2000-01-03 0.760319 0.897673
2000-01-04 0.288415 0.919107
I want to add rows for the missing dates between the overall min and max date for every first-level index value, filling the data columns with some fixed value (e.g. 0), so that my dataframe would look like:
data1 data2
a 2000-01-01 0.507132 0.649724
2000-01-02 0.0 0.0
2000-01-03 0.500277 0.859769
2000-01-04 0.0 0.0
b 2000-01-01 0.0 0.0
2000-01-02 0.0 0.0
2000-01-03 0.136397 0.166815
2000-01-04 0.102095 0.918545
I've tried the methods described in "pandas fill missing dates in time series" and "Add missing dates to pandas dataframe", e.g.:
df.loc[dates_list]
which raises:
TypeError: unhashable type: 'DatetimeIndex'
but I am unable to get either approach to work for a DataFrame with a multi-level index.
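The closest idea I have is to build the complete MultiIndex explicitly and reindex against it. A minimal sketch of that idea (assuming pd.MultiIndex.from_product and reindex with fill_value behave the way I expect; the .date conversion is there to match the datetime.date objects in my existing index):

# build the full date range spanning the overall min and max dates
full_dates = pd.date_range(df.index.get_level_values(1).min(),
                           df.index.get_level_values(1).max()).date
# cross every first-level key with every date in that range
full_index = pd.MultiIndex.from_product(
    [df.index.get_level_values(0).unique(), full_dates],
    names=df.index.names)
# reindex against the complete index, filling the newly added rows with 0
df_full = df.reindex(full_index, fill_value=0)

I'm not sure whether this is the idiomatic way to do it, or whether there is a cleaner approach for multi-level indexed frames.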