python pandas dataframe resample.last how to make sure data comes from the same row

Question

I have a DataFrame that looks like this:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1': range(9), 'col2': list(range(6)) + [np.nan] * 3},
    index=pd.date_range('1/1/2000', periods=9, freq='T'))

df
Out[63]: 
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:01:00     1   1.0
2000-01-01 00:02:00     2   2.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:04:00     4   4.0
2000-01-01 00:05:00     5   5.0
2000-01-01 00:06:00     6   NaN
2000-01-01 00:07:00     7   NaN
2000-01-01 00:08:00     8   NaN

When I resample using the last method, I get:

df.resample('3T', label='right', closed='right').last()
Out[60]: 
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   5.0
2000-01-01 00:09:00     8   NaN

As shown above, the row at minute 6 has data in col1, so after resampling col1 takes its value from that row, but col2 takes its value from the row at minute 5. Is there a way to make sure both values come from the minute-6 row? In other words, if col1 has data, resample should not fill col2's NaN with an earlier value but simply leave it as NaN.

Desired output:
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   NaN  <--- if at least one column has data, the whole row should be used in the resample
2000-01-01 00:09:00     8   NaN

Relevant: https://stackoverflow.com/questions/55583246/what-is-different-between-groupby-first-groupby-nth-groupby-head-when-as-index/55583395#55583395 — ALollz, Apr 29 '19 at 03:23

score 4 · Accepted Answer · answered Apr 29 '19 at 03:23

That is how last works in pandas: it returns the last non-null value in each group. If you want the last value regardless of whether it is NaN, use apply with iloc:

df.resample('3T', label='right', closed='right').apply(lambda x : x.iloc[-1])
Out[103]: 
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   NaN
2000-01-01 00:09:00     8   NaN
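
To see the two behaviours side by side, here is a minimal sketch that rebuilds the frame from the question and compares last() with an iloc-based apply. The helper name last_row_value is hypothetical, and the empty-bin guard is an assumption added for safety (the sample data has no empty bins, but a bare x.iloc[-1] would fail on one). It uses '3min' instead of '3T' because newer pandas releases deprecate the 'T' alias.

```python
import numpy as np
import pandas as pd

# Same data as in the question, one row per minute.
df = pd.DataFrame(
    {"col1": range(9), "col2": list(range(6)) + [np.nan] * 3},
    index=pd.date_range("2000-01-01", periods=9, freq="min"),
)


def last_row_value(s):
    # Hypothetical helper: take the positionally last value, NaN included.
    # The length check is an assumption to avoid an IndexError on empty bins.
    return s.iloc[-1] if len(s) else np.nan


# last() skips NaN, so each column may come from a different row.
skipping_nan = df.resample("3min", label="right", closed="right").last()

# apply() with iloc keeps whatever sits in the last row of each bin.
keeping_nan = df.resample("3min", label="right", closed="right").apply(last_row_value)

print(skipping_nan)
print(keeping_nan)
```

The only cell that differs between the two results is col2 at 00:06, which is 5.0 with last() and NaN with the iloc version.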

score 0 · Answer 2 · answered Apr 29 '19 at 03:29

Also possible with .nth(-1) or .tail(1) using ceil to form the same groups:

df.groupby(df.index.ceil('3T')).nth(-1)
                     col1  col2
2000-01-01 00:00:00     0   0.0
2000-01-01 00:03:00     3   3.0
2000-01-01 00:06:00     6   NaN
2000-01-01 00:09:00     8   NaN
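
A note on versions: as far as I know, from pandas 2.0 onward nth behaves as a filter and keeps the original row labels, so the last row would be labelled 00:08 rather than 00:09. Here is a sketch of the tail(1) variant that relabels the rows explicitly, which should give the bin labels regardless of version. The variable names are my own, and '3min' is used in place of '3T' since the 'T' alias is deprecated in recent releases.

```python
import numpy as np
import pandas as pd

# Same data as in the question, one row per minute.
df = pd.DataFrame(
    {"col1": range(9), "col2": list(range(6)) + [np.nan] * 3},
    index=pd.date_range("2000-01-01", periods=9, freq="min"),
)

# Group by the right bin edge; tail(1) keeps the last row of each group
# as-is (NaN included) and preserves the original timestamps.
last_rows = df.groupby(df.index.ceil("3min")).tail(1)

# Relabel each kept row with its bin edge to mimic label='right'.
last_rows = last_rows.set_axis(last_rows.index.ceil("3min"), axis=0)

print(last_rows)
```

Because tail(1) selects whole rows, every column in a result row is guaranteed to come from the same source row.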
