
If I have one DataFrame of start and end times like:

2015-11-21 16:00:00 2015-11-22 00:30:00
2015-05-16 12:15:00 2015-05-16 22:03:00
2015-10-15 16:00:00 2015-10-15 23:30:00

And then a Series of timestamps like this:

2015-11-21 18:42:13
2015-11-21 00:32:00
2015-05-16 12:37:00

And I want to be able to filter that Series to get only the times within the above times, so in this case:

2015-11-21 18:42:13
2015-05-16 12:37:00

series[series.between_time(df.start, df.end)] won't work because between_time needs scalar times, and series[df.start < series < df.end] gives me a ValueError because the Series aren't identically labelled.

I feel like there must be a way to do this with a Timedelta or something similar, but I can't figure out how.
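For reference, one vectorized way to express "is this timestamp inside any of the intervals" uses pd.IntervalIndex (a sketch, assuming closed='neither' is the right match for strict < comparisons on both ends):

```python
import pandas as pd

# Sample data from the question (assumed column names: start, end)
df = pd.DataFrame({
    'start': pd.to_datetime(['2015-11-21 16:00:00', '2015-05-16 12:15:00',
                             '2015-10-15 16:00:00']),
    'end':   pd.to_datetime(['2015-11-22 00:30:00', '2015-05-16 22:03:00',
                             '2015-10-15 23:30:00']),
})
series = pd.Series(pd.to_datetime(['2015-11-21 18:42:13',
                                   '2015-11-21 00:32:00',
                                   '2015-05-16 12:37:00']))

# One Interval per (start, end) row; 'neither' excludes both endpoints
idx = pd.IntervalIndex.from_arrays(df['start'], df['end'], closed='neither')

# Keep timestamps that fall inside at least one interval
mask = series.apply(lambda t: idx.contains(t).any())
print(series[mask])
```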

Mike D

2 Answers


You can first use a cross join and then filter with query:

print (df)
                start                 end
0 2015-11-21 16:00:00 2015-11-22 00:30:00
1 2015-05-16 12:15:00 2015-05-16 22:03:00
2 2015-10-15 16:00:00 2015-10-15 23:30:00

print (series)
0   2015-11-21 18:42:13
1   2015-11-21 00:32:00
2   2015-05-16 12:37:00 <-changed date
Name: date, dtype: datetime64[ns]

df = pd.merge(df.assign(A=1), series.to_frame().assign(A=1), on='A')
df = df.query('start < date < end')['date']
print (df)
0   2015-11-21 18:42:13
5   2015-05-16 12:37:00
Name: date, dtype: datetime64[ns]
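As a self-contained sketch of the same idea: on pandas 1.2+ the helper column A can be dropped in favour of how='cross' (assuming the column names start, end, and date from above):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'start': pd.to_datetime(['2015-11-21 16:00:00', '2015-05-16 12:15:00',
                             '2015-10-15 16:00:00']),
    'end':   pd.to_datetime(['2015-11-22 00:30:00', '2015-05-16 22:03:00',
                             '2015-10-15 23:30:00']),
})
series = pd.Series(pd.to_datetime(['2015-11-21 18:42:13',
                                   '2015-11-21 00:32:00',
                                   '2015-05-16 12:37:00']), name='date')

# Pair every interval with every timestamp, then keep rows where the
# timestamp falls strictly inside the interval (pandas >= 1.2)
out = (pd.merge(df, series.to_frame(), how='cross')
         .query('start < date < end')['date'])
print(out)
```

The cross join materializes len(df) * len(series) rows, so this is fine for small inputs but can get expensive on large ones.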
jezrael

Assuming your DataFrame is df and your Series is series, we can use searchsorted to check whether the sorted position of each value in series lands within the same row.

df.sort_values('start', inplace=True)

series.loc[
    df.iloc[:, 0].searchsorted(series) - 1 == df.iloc[:, 1].searchsorted(series)
]

0   2015-11-21 18:42:13
2   2015-05-16 12:37:00
dtype: datetime64[ns]

We can numpyfy this with

v = df.sort_values('start').values
s = series.values

series.loc[v[:, 0].searchsorted(s) - 1 == v[:, 1].searchsorted(s)]

0   2015-11-21 18:42:13
2   2015-05-16 12:37:00
dtype: datetime64[ns]
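Put together as a runnable sketch (assuming start is the first column, end the second, and the intervals don't overlap, so both columns are sorted after the sort on start):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'start': pd.to_datetime(['2015-11-21 16:00:00', '2015-05-16 12:15:00',
                             '2015-10-15 16:00:00']),
    'end':   pd.to_datetime(['2015-11-22 00:30:00', '2015-05-16 22:03:00',
                             '2015-10-15 23:30:00']),
})
series = pd.Series(pd.to_datetime(['2015-11-21 18:42:13',
                                   '2015-11-21 00:32:00',
                                   '2015-05-16 12:37:00']))

# Sort intervals so searchsorted's sorted-input assumption holds
v = df.sort_values('start').values
s = series.values

# A timestamp inside interval i sorts just past start i (position i+1
# against the starts) but before end i (position i against the ends),
# so the two searches agree only for timestamps inside some interval
mask = v[:, 0].searchsorted(s) - 1 == v[:, 1].searchsorted(s)
result = series.loc[mask]
print(result)
```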
piRSquared