0

I would like to match a number of events to the right time ranges, meaning the event is taking place after the start of the range, but before the end of it. I wonder what if there is a cleaner way to do that than this one?

Here is how I am trying to do it now. Not optimal in my opinion.

import pandas as pd

events = pd.DataFrame([[1], [5], [8], [10]], 
                      columns=['event_time'])
ranges = pd.DataFrame([[0, 5], [5, 7], [7, 9]], 
                      columns=['start_time', 'end_time'])

print(events)
   event_time
0           1
1           5
2           8
3          10

print(ranges)
   start_time  end_time
0           0         5
1           5         7
2           7         9

First, I find the potential ranges that start before the event - but don't look at the end time of the range yet.

df = pd.merge_asof(events,
                   ranges,
                   left_on='event_time',
                   right_on='start_time',
                   allow_exact_matches=True)

Here how it looks

print(df)
   event_time  start_time  end_time
0           1           0         5
1           5           5         7
2           8           7         9
3          10           7         9

But the last row is actually not what I want. The range already ended at 9, so event at 10 should not be matched to it.

So I will handle this with this one

df.loc[df['event_time'] > df['end_time'], ['start_time', 'end_time']] = None

print(df)

   event_time  start_time  end_time
0           1         0.0       5.0
1           5         5.0       7.0
2           8         7.0       9.0
3          10         NaN       NaN

However, I wonder if there is a way to do this within one merge_asof-like function in a cleaner way?

MattiH
  • 554
  • 5
  • 9

0 Answers0