1

As per the title there's a list of available/possible time slots and then there's a list of booked slots. What I need a helping hand with, is a streamlined way, using pandas, to extract booked time slots from the possible ones and rebuild the free time slots data frame. Thank you very much

#1 Possible slots:

>>> df1
                  start                  end
0   2023-02-28 08:00:00  2023-02-28 08:30:00
1   2023-02-28 08:30:00  2023-02-28 09:00:00
2   2023-02-28 09:00:00  2023-02-28 09:30:00
3   2023-02-28 09:30:00  2023-02-28 10:00:00
4   2023-02-28 10:00:00  2023-02-28 10:30:00
5   2023-02-28 10:30:00  2023-02-28 11:00:00
6   2023-02-28 11:00:00  2023-02-28 11:30:00
7   2023-02-28 11:30:00  2023-02-28 12:00:00
8   2023-02-28 12:00:00  2023-02-28 12:30:00
9   2023-02-28 12:30:00  2023-02-28 13:00:00
10  2023-02-28 13:00:00  2023-02-28 13:30:00
11  2023-02-28 13:30:00  2023-02-28 14:00:00
12  2023-02-28 14:00:00  2023-02-28 14:30:00
13  2023-02-28 14:30:00  2023-02-28 15:00:00
14  2023-02-28 15:00:00  2023-02-28 15:30:00
15  2023-02-28 15:30:00  2023-02-28 16:00:00
>>> 

#2 Booked slots:

>>> df2
                 start                  end
0  2023-02-28 08:00:00  2023-02-28 08:15:00
1  2023-02-28 08:15:00  2023-02-28 08:30:00
2  2023-02-28 09:00:00  2023-02-28 09:30:00
3  2023-02-28 12:00:00  2023-02-28 12:45:00
4  2023-02-28 13:15:00  2023-02-28 14:45:00
>>> 

#3 The result should be:

>>> df3
                  start                  end
0   2023-02-28 08:30:00  2023-02-28 09:00:00
1   2023-02-28 09:30:00  2023-02-28 10:00:00
2   2023-02-28 10:00:00  2023-02-28 10:30:00
3   2023-02-28 10:30:00  2023-02-28 11:00:00
4   2023-02-28 11:00:00  2023-02-28 11:30:00
5   2023-02-28 11:30:00  2023-02-28 12:00:00
6   2023-02-28 12:45:00  2023-02-28 13:00:00
7   2023-02-28 13:00:00  2023-02-28 13:15:00
8   2023-02-28 14:45:00  2023-02-28 15:00:00
9   2023-02-28 15:00:00  2023-02-28 15:30:00
10  2023-02-28 15:30:00  2023-02-28 16:00:00
  • Have you tried to compare the two dataframes? There is an answer https://stackoverflow.com/questions/48647534/find-difference-between-two-data-frames – KimKulling Feb 28 '23 at 09:07

2 Answers2

0

You can repeat values in 5 minutes intervals by difference of both columns start/end in both DataFrames, create helper column new by add 5 minutes timedeltas by GroupBy.cumcount to start column and filter in Series.isin with boolean indexing, last aggregate to first and last value (or min and max values) with add 5 minutes to end column:

#5 minutes intervals, change if necessary
N = 5
df1['start'] = pd.to_datetime(df1['start'])
df1['end'] = pd.to_datetime(df1['end'])

df1 = df1.loc[df1.index.repeat(df1['end'].sub(df1['start']).dt.total_seconds().div(N*60))]
counter1 = df1.groupby(level=0).cumcount()
df1['new'] = df1['start'].add(pd.to_timedelta(counter1, unit='Min').mul(N))

df2['start'] = pd.to_datetime(df2['start'])
df2['end'] = pd.to_datetime(df2['end'])

df2 = df2.loc[df2.index.repeat(df2['end'].sub(df2['start']).dt.total_seconds().div(N*60))]
counter2 = df2.groupby(level=0).cumcount()
df2['new'] = df2['start'].add(pd.to_timedelta(counter2, unit='Min').mul(N))

df3 = (df1[~df1['new'].isin(df2['new'])].groupby(level=0).agg(start=('new', 'first'),
                                                              end=('new', 'last'))
           .assign(end = lambda x: x['end'].add(pd.Timedelta(f'{N}Min')))
           .reset_index(drop=True))
print (df3)
                 start                 end
0  2023-02-28 08:30:00 2023-02-28 09:00:00
1  2023-02-28 09:30:00 2023-02-28 10:00:00
2  2023-02-28 10:00:00 2023-02-28 10:30:00
3  2023-02-28 10:30:00 2023-02-28 11:00:00
4  2023-02-28 11:00:00 2023-02-28 11:30:00
5  2023-02-28 11:30:00 2023-02-28 12:00:00
6  2023-02-28 12:45:00 2023-02-28 13:00:00
7  2023-02-28 13:00:00 2023-02-28 13:15:00
8  2023-02-28 14:45:00 2023-02-28 15:00:00
9  2023-02-28 15:00:00 2023-02-28 15:30:00
10 2023-02-28 15:30:00 2023-02-28 16:00:00
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

It's actually quite simple. A time slot is used if either:

  • the start of a booked slot is ≥ the start of the slot and < its end
  • the end of a booked slot is ≤ the end of the slot and > its start

Use a merge_asof with boolean indexing:

# ensure datetime
df1[['start', 'end']] = df1[['start', 'end']].apply(pd.to_datetime)
df2[['start', 'end']] = df2[['start', 'end']].apply(pd.to_datetime)

# merge and filter:
out = (
 pd.merge_asof(df1, df2.add_suffix('2'), left_on='start', right_on='start2')
   .loc[lambda d: ~((d['start2'].ge(d['start']) & d['start2'].lt(d['end']))
                   |(d['end2'].le(d['end']) & d['end2'].gt(d['start']))
                   ),
        ['start', 'end']]
)

Output:

                 start                 end
1  2023-02-28 08:30:00 2023-02-28 09:00:00
3  2023-02-28 09:30:00 2023-02-28 10:00:00
4  2023-02-28 10:00:00 2023-02-28 10:30:00
5  2023-02-28 10:30:00 2023-02-28 11:00:00
6  2023-02-28 11:00:00 2023-02-28 11:30:00
7  2023-02-28 11:30:00 2023-02-28 12:00:00
10 2023-02-28 13:00:00 2023-02-28 13:30:00
11 2023-02-28 13:30:00 2023-02-28 14:00:00
12 2023-02-28 14:00:00 2023-02-28 14:30:00
14 2023-02-28 15:00:00 2023-02-28 15:30:00
15 2023-02-28 15:30:00 2023-02-28 16:00:00

Intermediates:

                 start                 end              start2                end2  start2_in_slot  end2_in_slot  available
0  2023-02-28 08:00:00 2023-02-28 08:30:00 2023-02-28 08:00:00 2023-02-28 08:15:00            True          True      False
1  2023-02-28 08:30:00 2023-02-28 09:00:00 2023-02-28 08:15:00 2023-02-28 08:30:00           False         False       True
2  2023-02-28 09:00:00 2023-02-28 09:30:00 2023-02-28 09:00:00 2023-02-28 09:30:00            True          True      False
3  2023-02-28 09:30:00 2023-02-28 10:00:00 2023-02-28 09:00:00 2023-02-28 09:30:00           False         False       True
4  2023-02-28 10:00:00 2023-02-28 10:30:00 2023-02-28 09:00:00 2023-02-28 09:30:00           False         False       True
5  2023-02-28 10:30:00 2023-02-28 11:00:00 2023-02-28 09:00:00 2023-02-28 09:30:00           False         False       True
6  2023-02-28 11:00:00 2023-02-28 11:30:00 2023-02-28 09:00:00 2023-02-28 09:30:00           False         False       True
7  2023-02-28 11:30:00 2023-02-28 12:00:00 2023-02-28 09:00:00 2023-02-28 09:30:00           False         False       True
8  2023-02-28 12:00:00 2023-02-28 12:30:00 2023-02-28 12:00:00 2023-02-28 12:45:00            True         False      False
9  2023-02-28 12:30:00 2023-02-28 13:00:00 2023-02-28 12:00:00 2023-02-28 12:45:00           False          True       True
10 2023-02-28 13:00:00 2023-02-28 13:30:00 2023-02-28 12:00:00 2023-02-28 12:45:00           False         False       True
11 2023-02-28 13:30:00 2023-02-28 14:00:00 2023-02-28 13:15:00 2023-02-28 14:45:00           False         False       True
12 2023-02-28 14:00:00 2023-02-28 14:30:00 2023-02-28 13:15:00 2023-02-28 14:45:00           False         False       True
13 2023-02-28 14:30:00 2023-02-28 15:00:00 2023-02-28 13:15:00 2023-02-28 14:45:00           False          True       True
14 2023-02-28 15:00:00 2023-02-28 15:30:00 2023-02-28 13:15:00 2023-02-28 14:45:00           False         False       True
15 2023-02-28 15:30:00 2023-02-28 16:00:00 2023-02-28 13:15:00 2023-02-28 14:45:00           False         False       True
mozway
  • 194,879
  • 13
  • 39
  • 75