I have a DataFrame that looks like this:
df = pd.DataFrame({'col1': range(9), 'col2': list(range(6)) + [np.nan] * 3},
                  index=pd.date_range('1/1/2000', periods=9, freq='T'))
df
Out[63]:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:01:00     1   1.0
2000-01-01 00:02:00     2   2.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:04:00     4   4.0
2000-01-01 00:05:00     5   5.0
2000-01-01 00:06:00     6   NaN
2000-01-01 00:07:00     7   NaN
2000-01-01 00:08:00     8   NaN
When I resample it using last(), I get:
df.resample('3T', label='right', closed='right').last()
Out[60]:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   5.0
2000-01-01 00:09:00     8   NaN
As shown above, the row for minute 6 has data in col1, so after resampling col1 takes its value from that row, but col2 takes its value from the row for minute 5. Is there a way to make sure both values come from the minute 6 row? In other words, if col1 has data in the last row of a bin, resample should not fill col2's NaN with the last valid value but simply leave it as NaN.

Desired output:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   NaN   <--- if at least one column has data, the whole row is used in the resample
2000-01-01 00:09:00     8   NaN
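
For reference, here is a minimal sketch of one way the desired behaviour might be obtained; it is not necessarily the best or only approach. last() returns the last non-null value of each column separately, which is why col2 picks up 5.0. To keep the last row of each bin as a whole, one option is to label every row with the right edge of its bin and keep only the final row per label. The names per_column, labels and last_row are made up for illustration, 'min' is used as the newer spelling of the 'T' alias, and the ceil-based labelling assumes the bins line up with multiples of 3 minutes, as they do in this example.

```python
import numpy as np
import pandas as pd

# Same data as in the question; 'min' is the newer spelling of the 'T' alias.
df = pd.DataFrame(
    {"col1": range(9), "col2": list(range(6)) + [np.nan] * 3},
    index=pd.date_range("1/1/2000", periods=9, freq="min"),
)

# last() takes the last non-null value per column, so col2 at 00:06 is 5.0.
per_column = df.resample("3min", label="right", closed="right").last()

# Illustrative alternative: label each row with the right edge of its
# (t - 3min, t] bin, then keep only the final row carrying each label.
labels = df.index.ceil("3min")
last_row = df.set_index(labels)
last_row = last_row[~last_row.index.duplicated(keep="last")]

print(last_row)
```

Unlike resample, this sketch does not emit rows for bins that contain no data, so a reindex would be needed if empty bins matter.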