I would like to match a number of events to the right time ranges, meaning the event is taking place after the start of the range, but before the end of it. I wonder what if there is a cleaner way to do that than this one?
Here is how I am trying to do it now. Not optimal in my opinion.
import pandas as pd
events = pd.DataFrame([[1], [5], [8], [10]],
columns=['event_time'])
ranges = pd.DataFrame([[0, 5], [5, 7], [7, 9]],
columns=['start_time', 'end_time'])
print(events)
event_time
0 1
1 5
2 8
3 10
print(ranges)
start_time end_time
0 0 5
1 5 7
2 7 9
First, I find the potential ranges that start before the event - but don't look at the end time of the range yet.
df = pd.merge_asof(events,
ranges,
left_on='event_time',
right_on='start_time',
allow_exact_matches=True)
Here how it looks
print(df)
event_time start_time end_time
0 1 0 5
1 5 5 7
2 8 7 9
3 10 7 9
But the last row is actually not what I want. The range already ended at 9, so event at 10 should not be matched to it.
So I will handle this with this one
df.loc[df['event_time'] > df['end_time'], ['start_time', 'end_time']] = None
print(df)
event_time start_time end_time
0 1 0.0 5.0
1 5 5.0 7.0
2 8 7.0 9.0
3 10 NaN NaN
However, I wonder if there is a way to do this within one merge_asof-like function in a cleaner way?