21

This is an algorithmic question about a somewhat complex problem. The foundation is this:

A scheduling system based on available slots and reserved slots. Slots have certain criteria, let's call them tags. A reservation is matched to an available slot by those tags, if the available slot's tag set is a superset of the reserved slot.

As a concrete example, take this scenario:

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+

Between the times of 11:00 to 12:30 reservations for the tags A and B can be made, from 12:00 to 13:30 C and D is available, and there's an overlap from about 12:00 to 12:30.

11:00  12:00  13:00
+--------+
| A, B   |
+--------+
       +--------+
       | C, D   |
       +--------+
  xxxxxx
  x A  x
  xxxxxx

Here a reservation for A has been made, so no other reservations for A or B can be made between 11:15-ish and 12:00-ish.

That's the idea in a nutshell. There are no specific limitations for the available slots:

  • an available slot can contain any number of tags
  • any number of slots can overlap at any time
  • slots are of arbitrary length
  • reservations can contain any number of tags

The only rule that needs to be obeyed in the system is:

  • when adding a reservation, at least one remaining available slot must match all the tags in the reservation

To clarify: when there are two available slots at the same time with, say, tag A, then two reservations for A can be made at that time, but no more.

I have that working with a modified implementation of an interval tree; as a quick overview:

  • all available slots are added to the interval tree (duplicates/overlaps are preserved)
  • all reserved slots are iterated and:
    • all available slots matching the time of the reservation are queried from the tree
    • the first of those matching the reservation's tags is sliced and the slice removed from the tree

When that process is finished, what's left are the remaining slices of available slots, and I can query whether a new reservation can be made for a particular time and add it.

Data structures look something like this:

{
  type: 'available', 
  begin: 1497857244, 
  end: 1497858244, 
  tags: [{ foo: 'bar' }, { baz: 42 }]
}
{
  type: 'reserved', 
  begin: 1497857345, 
  end: 1497857210, 
  tags: [{ foo: 'bar' }]
}

Tags are themselves key-value objects, a list of them is a "tag set". Those could be serialised if it helps; so far I'm using a Python set type which makes comparison easy enough. Slot begin/end times are UNIX time stamps within the tree. I'm not particularly married to these specific data structures and can refactor them if it's useful.


The problem I'm facing is that this doesn't work bug-free; every once in a while a reservation sneaks its way into the system that conflicts with other reservations, and I couldn't yet figure out how that can happen exactly. It's also not very clever when tags overlap in a complex way where the optimal distribution needs to be calculated so all reservations can be fit into the available slots as best as possible; in fact currently it's non-deterministic how reservations are matched to available slots in overlapping scenarios.

What I want to know is: interval trees are mostly great for this purpose, but my current system to add tag set matching as an additional dimension to this is clunky and bolted-on; is there a data structure or algorithm that can handle this in an elegant way?

Actions that must be supported:

  1. Querying the system for available slots that match certain tag sets (taking into account reservations that may reduce availability but are not themselves part of said tag set; e.g. in the example above querying for an availability for B).
  2. Ensuring no reservations can be added to the system which don't have a matching available slot.
deceze
  • 510,633
  • 85
  • 743
  • 889
  • You haven't explained inputs very clearly. Can a give reservation be assigned to any slot as long as the tags match correctly? Or do the reservations have time intervals attached? – Gene Jun 19 '17 at 04:54
  • I've clarified the data structures used. Yes, a reservation just needs to match the tags and time of an available slot to be acceptable. – deceze Jun 19 '17 at 07:36
  • instead of having all available slots in the tree and then removing them from that, would it make sense to start with an empty tree and then try to fill it with reservations, such that there are no conflicts? – Florian Castellane Jun 19 '17 at 08:04
  • @Florian I'm totally open to any approach. I just have a bunch of values according to the data structures I posted (and even those are malleable) and need to be able to 1) get still available slots filterable by tag sets and 2) check whether a new reservation can be added. – deceze Jun 19 '17 at 08:07
  • By the way, you are talking about calculating the optimal allocation for reservations ; but isn't "first come, first served" the idiom for reservations? Could you give an example where a previous reservation would need to change? – Florian Castellane Jun 19 '17 at 08:17
  • 1
    @Florian Take something like Apple Store Genius Bar reservations: you're just reserving a slot to talk to *someone* about your iPhone, say, it doesn't matter which particular slot or person you get. Just *someone* with the knowledge to talk to you about your iPhone needs to be available at that time, but there'll be multiple people and which one exactly you get will be figured out then and there. – deceze Jun 19 '17 at 08:20
  • " _The first [...] is sliced and the slice __removed [...] what's left are the remaining slices of available slots__, and I can query whether a new reservation can be made for a particular time and add it._ " are the slots removed from the tree as they get reserved? or is the type just set to `'reserved'` as your data structure suggests? – Florian Castellane Jun 19 '17 at 09:04
  • @Florian Currently I'm doing all these calculations as needed on the fly, the tree is not persisted anywhere. I'm pulling all the relevant slots of both types out of the data store, build the tree, slice the tree, and then either get the "remaining" available slots from it or check if a new reservation can be added. – I'm considering changing this to a persistent in-memory tree at the moment, so you're free to suggest solutions that use either approach. – deceze Jun 19 '17 at 09:09
  • "slots are of arbitrary length" - but there's some non-zero quantum of duration here, right? I'm thinking of slicing the entire time dimension into `n` duration quanta (eg hours, quarter hours, minutes, whatever the requirements actually are): then each quantum has for **availability** a set of (tag, count) associated with it. Reservations are then a start and end time and a set of (tag) which, when assigned, decrement the applicable count on the slot. Reservation checking is straightforward. – AakashM Jun 19 '17 at 10:44
  • @AakashM Yes, durations are non-zero. Practically speaking they'll be at least 5 minutes, but theoretically could be as small as 1 second (though that makes no practical sense). Your suggestion sounds intriguing, but you'd have to expand on that; can't quite wrap my head around it from the terse comment… :) – deceze Jun 19 '17 at 10:48

3 Answers3

8

Your problem can be solved using constraint programming. In python this can be implemented using the python-constraint library.

First, we need a way to check if two slots are consistent with each other. this is a function that returns true if two slots share a tag and their rimeframes overlap. In python this can be implemented using the following function

def checkNoOverlap(slot1, slot2):
    shareTags = False
    for tag in slot1['tags']:
        if tag in slot2['tags']:
            shareTags = True
            break    
    if not shareTags: return True
    return not (slot2['begin'] <= slot1['begin'] <= slot2['end'] or 
                slot2['begin'] <= slot1['end'] <= slot2['end'])

I was not sure whether you wanted the tags to be completely the same (like {foo: bar} equals {foo: bar}) or only the keys (like {foo: bar} equals {foo: qux}), but you can change that in the function above.

Consistency check

We can use the python-constraint module for the two kinds of functionality you requested.

The second functionality is the easiest. To implement this, we can use the function isConsistent(set) which takes a list of slots in the provided data structure as input. The function will then feed all the slots to python-constraint and will check if the list of slots is consistent (no 2 slots that shouldn't overlap, overlap) and return the consistency.

def isConsistent(set):
        #initialize python-constraint context
        problem = Problem()
        #add all slots the context as variables with a singleton domain
        for i in range(len(set)):
            problem.addVariable(i, [set[i]])        
        #add a constraint for each possible pair of slots
        for i in range(len(set)):
            for j in range(len(set)):
                #we don't want slots to be checked against themselves
                if i == j:
                    continue
                #this constraint uses the checkNoOverlap function
                problem.addConstraint(lambda a,b: checkNoOverlap(a, b), (i, j))
        # getSolutions returns all the possible combinations of domain elements
        # because all domains are singleton, this either returns a list with length 1 (consistent) or 0 (inconsistent)
        return not len(problem.getSolutions()) == 0

This function can be called whenever a user wants to add a reservation slot. The input slot can be added to the list of already existing slots and the consistency can be checked. If it is consistent, the new slot an be reserverd. Else, the new slot overlaps and should be rejected.

Finding available slots

This problem is a bit trickier. We can use the same functionality as above with a few significant changes. Instead of adding the new slot together with the existing slot, we now want to add all possible slots to the already existing slots. We can then check the consistency of all those possible slots with the reserved slots and ask the constraint system for the combinations that are consistent.

Because the number of possible slots would be infinite if we didn't put any restrictions on it, we first need to declare some parameters for the program:

MIN = 149780000 #available time slots can never start earlier then this time
MAX = 149790000 #available time slots can never start later then this time
GRANULARITY = 1*60 #possible time slots are always at least one minut different from each other

We can now continue to the main function. It looks a lot like the consistency check, but instead of the new slot from the user, we now add a variable to discover all available slots.

def availableSlots(tags, set):
    #same as above
    problem = Problem()
    for i in range(len(set)):
        problem.addVariable(i, [set[i]])
    #add an extra variable for the available slot is added, with a domain of all possible slots
    problem.addVariable(len(set), generatePossibleSlots(MIN, MAX, GRANULARITY, tags))
    for i in range(len(set) +1):
        for j in range(len(set) +1):
            if i == j:
                continue
            problem.addConstraint(lambda a, b: checkNoOverlap(a, b), (i, j))
    #extract the available time slots from the solution for clean output
    return filterAvailableSlots(problem.getSolutions())

I use some helper functions to keep the code cleaner. They are included here.

def filterAvailableSlots(possibleCombinations):
    result = []
    for slots in possibleCombinations:
        for key, slot in slots.items():
            if slot['type'] == 'available':
                result.append(slot)

    return result

def generatePossibleSlots(min, max, granularity, tags):
    possibilities = []
    for i in range(min, max - 1, granularity):
        for j in range(i + 1, max, granularity):
            possibleSlot = {
                              'type': 'available',
                              'begin': i,
                              'end': j,
                              'tags': tags
            }
            possibilities.append(possibleSlot)
    return tuple(possibilities)

You can now use the function getAvailableSlots(tags, set) with the tags for which you want the available slots and a set of already reserved slots. Note that this function really return all the consistent possible slots, so no effort is done to find the one of maximum lenght or for other optimalizations.

Hope this helps! (I got it to work as you described in my pycharm)

  • Question: you seem to make no distinction between available and reserved slots in the comparison method… am I supposed to feed all slots (both available and reserved) into the `Problem` without distinction? Then `checkNoOverlap` should make some concession there, since available slots may "overlap" as well, no? – deceze Jun 21 '17 at 08:16
  • Yes, you are supposed to feed the reserved slots, who are fixed, as well as an available slot that can vary over the range to the problem. The Problem class will then return all correct combinations of the reserved slots and a possibility of the domain of the available slot variable. – Arne Van Den Kerchove Jun 21 '17 at 08:33
  • Remember that it is important to add one variable for each reserved slot, with a domain of size one containing only that slot, and only one variable for the available slot, with a domain containing all the possible available slots generated – Arne Van Den Kerchove Jun 21 '17 at 08:42
  • That is for getting all available slots? I'm currently experimenting with the first step of consistency checking. – Shouldn't there be a difference between "overlapping" and "containing"? A reservation must be "contained" by an available slot, mere overlapping won't do. – deceze Jun 21 '17 at 08:44
  • So you would also like to place new reservations in predefined available slots and check if they don't overlap with other reserved slots and are contained in an available slot for that tag? – Arne Van Den Kerchove Jun 21 '17 at 08:48
  • Yes. Available slots are being defined first, then reservations can be added wherever there's an availability, taking into account already existing reservations which "subtract" from the availabilities. So I need to both get slots which are still available after subtracting existing reservations, and (as a corollary) consistency-check the whole set whenever trying to add a new reservation. – deceze Jun 21 '17 at 08:51
  • then you could easily add those available slots as a second set of variables in the problem, isolate the new reservation from the old ones in the input and add a new constraint that a new reservation should be contained in one of the available ones – Arne Van Den Kerchove Jun 21 '17 at 08:53
  • 1
    Something like that, yeah. Thanks for clarifying that your existing sample doesn't quite cover it. I'll continue experimenting with this. Would be happy about an updated sample from your side, obviously. :o) – deceze Jun 21 '17 at 08:55
  • Maybe this wasn't clear enough in the question: "available slots" express the availability of one person during a specific timeframe; that person can take multiple reservations within that timeframe, as long as they don't overlap. I.e. a single availability slot from 11:00 to 13:00 can pack eight 15 minute reservations. I don't think your approach considers this, does it? – deceze Jun 21 '17 at 12:24
  • Well you could set the minimum and maximum values for the possible slots to the start and end of the available slot for the user, or add a greater than and smaller than constraint for consistency checking – Arne Van Den Kerchove Jun 21 '17 at 12:37
4

Here's a solution, I'll include all the code below.

1. Create a table of slots, and a table of reservations

example tables

2. Create a matrix of reservations x slots

which is populated by true or false values based on whether that reservation-slot combination are possible

example boolean combinations matrix

3. Figure out the best mapping that allows for the most Reservation-Slot Combinations

Note: my current solution scales poorly with very large arrays as it involves looping through all possible permutations of a list with size = number of slots. I've posted another question to see if anyone can find a better way of doing this. However, this solution is accurate and can be optimized

enter image description here

Python Code Source

Part 1

from IPython.display import display
import pandas as pd
import datetime

available_data = [
    ['SlotA', datetime.time(11, 0, 0), datetime.time(12, 30, 0), set(list('ABD'))],
    ['SlotB',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('C'))],
    ['SlotC',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('ABCD'))],
    ['SlotD',datetime.time(12, 0, 0), datetime.time(13, 30, 0), set(list('AD'))],
]

reservation_data = [
    ['ReservationA', datetime.time(11, 15, 0), datetime.time(12, 15, 0), set(list('AD'))],
    ['ReservationB', datetime.time(11, 15, 0), datetime.time(12, 15, 0), set(list('A'))],
    ['ReservationC', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('C'))],
    ['ReservationD', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('C'))],
    ['ReservationE', datetime.time(12, 0, 0), datetime.time(12, 15, 0), set(list('D'))]
]

reservations = pd.DataFrame(data=reservation_data, columns=['reservations', 'begin', 'end', 'tags']).set_index('reservations')
slots = pd.DataFrame(data=available_data, columns=['slots', 'begin', 'end', 'tags']).set_index('slots')

display(slots)
display(reservations)

Part 2

def is_possible_combination(r):
    return (r['begin'] >= slots['begin']) & (r['end'] <= slots['end']) & (r['tags'] <= slots['tags'])

solution_matrix = reservations.apply(is_possible_combination, axis=1).astype(int)
display(solution_matrix)

Part 3

import numpy as np
from itertools import permutations

# add dummy columns to make the matrix square if it is not
sqr_matrix = solution_matrix
if sqr_matrix.shape[0] > sqr_matrix.shape[1]:
    # uhoh, there are more reservations than slots... this can't be good
    for i in range(sqr_matrix.shape[0] - sqr_matrix.shape[1]):
        sqr_matrix.loc[:,'FakeSlot' + str(i)] = [1] * sqr_matrix.shape[0]
elif sqr_matrix.shape[0] < sqr_matrix.shape[1]:
    # there are more slots than customers, why doesn't anyone like us?
    for i in range(sqr_matrix.shape[0] - sqr_matrix.shape[1]):
        sqr_matrix.loc['FakeCustomer' + str(i)] = [1] * sqr_matrix.shape[1]

# we only want the values now
A = solution_matrix.values.astype(int)

# make an identity matrix (the perfect map)
imatrix = np.diag([1]*A.shape[0])

# randomly swap columns on the identity matrix until they match. 
n = A.shape[0]

# this will hold the map that works the best
best_map_so_far = np.zeros([1,1])

for column_order in permutations(range(n)):
    # this is an identity matrix with the columns swapped according to the permutation
    imatrix = np.zeros(A.shape)
    for row, column in enumerate(column_order):
        imatrix[row,column] = 1

    # is this map better than the previous best?
    if sum(sum(imatrix * A)) > sum(sum(best_map_so_far)):
        best_map_so_far = imatrix

    # could it be? a perfect map??
    if sum(sum(imatrix * A)) == n:
        break

if sum(sum(imatrix * A)) != n:
    print('a perfect map was not found')

output = pd.DataFrame(A*imatrix, columns=solution_matrix.columns, index=solution_matrix.index, dtype=int)
display(output)
tinker tailor
  • 161
  • 1
  • 5
  • The optimisation for testing all combinations might be possible with [Arne's constraint programming](https://stackoverflow.com/a/44661496/476)…? Thanks for the answer, I'll dig into it and see where it leads. – deceze Jun 24 '17 at 07:43
  • yep, that would work. It's tricky to explain, but you can constrain the combinations based on the matrix of possible combinations (since you only want to pay attention to the elements with 1 in them anyways). But that might actually be more difficult to code with the constraint library, I'll try to cook something up later – tinker tailor Jun 24 '17 at 17:32
  • Yeah, I'm also having difficulties expressing all applicable constraints with Arne's answer, so I'm welcoming this alternative approach. Will play around with it as soon as I have time. – deceze Jun 24 '17 at 17:35
0

The suggested approaches by Arne and tinker were both helpful, but not ultimately sufficient. I came up with a hybrid approach that solves it well enough.

The main problem is that it's a three-dimensional issue, which is difficult to solve in all dimensions at once. It's not just about matching a time overlap or a tag overlap, it's about matching time slices with tag overlaps. It's simple enough to match slots to other slots based on time and even tags, but it's then pretty complicated to match an already matched availability slot to another reservation at another time. Meaning, this scenario in which one availability can cover two reservations at different times:

+---------+
| A, B    |
+---------+
xxxxx xxxxx
x A x x A x
xxxxx xxxxx

Trying to fit this into constraint based programming requires an incredibly complex relationship of constraints which is hardly manageable. My solution to this was to simplify the problem…

Removing one dimension

Instead of solving all dimensions at once, it simplifies the problem enormously to largely remove the dimension of time. I did this by using my existing interval tree and slicing it as needed:

def __init__(self, slots):
    self.tree = IntervalTree(slots)

def timeslot_is_available(self, start: datetime, end: datetime, attributes: set):
    candidate = Slot(start.timestamp(), end.timestamp(), dict(type=SlotType.RESERVED, attributes=attributes))
    slots = list(self.tree[start.timestamp():end.timestamp()])
    return self.model_is_consistent(slots + [candidate])

To query whether a specific slot is available, I take only the slots relevant at that specific time (self.tree[..:..]), which reduces the complexity of the calculation to a localised subset:

  |      |             +-+ = availability
+-|------|-+           xxx = reservation
  |  +---|------+
xx|x  xxx|x
  |  xxxx|
  |      |

Then I confirm the consistency within that narrow slice:

@staticmethod
def model_is_consistent(slots):
    def can_handle(r):
        return lambda a: r.attributes <= a.attributes and a.contains_interval(r)

    av = [s for s in slots if s.type == SlotType.AVAILABLE]
    rs = [s for s in slots if s.type == SlotType.RESERVED]

    p = Problem()
    p.addConstraint(AllDifferentConstraint())
    p.addVariables(range(len(rs)), av)

    for i, r in enumerate(rs):
        p.addConstraint(can_handle(r), (i,))

    return p.getSolution() is not None

(I'm omitting some optimisations and other code here.)

This part is the hybrid approach of Arne's and tinker's suggestions. It uses constraint-based programming to find matching slots, using the matrix algorithm suggested by tinker. Basically: if there's any solution to this problem in which all reservations can be assigned to a different available slot, then this time slice is in a consistent state. Since I'm passing in the desired reservation slot, if the model is still consistent including that slot, this means it's safe to reserve that slot.

This is still problematic if there are two short reservations assignable to the same availability within this narrow window, but the chances of that are low and the result is merely a false negative for an availability query; false positives would be more problematic.

Finding available slots

Finding all available slots is a more complex problem, so again some simplification is necessary. First, it's only possible to query the model for availabilities for a particular set of tags (there's no "give me all globally available slots"), and secondly it can only be queried with a particular granularity (desired slot length). This suits me well for my particular use case, in which I just need to offer users a list of slots they can reserve, like 9:15-9:30, 9:30-9:45, etc.. This makes the algorithm very simple by reusing the above code:

def free_slots(self, start: datetime, end: datetime, attributes: set, granularity: timedelta):
    slots = []
    while start < end:
        slot_end = start + granularity
        if self.timeslot_is_available(start, slot_end, attributes):
            slots.append((start, slot_end))
        start += granularity

    return slots

In other words, it just goes through all possible slots during the given time interval and literally checks whether that slot is available. It's a bit of a brute-force solution, but works perfectly fine.

deceze
  • 510,633
  • 85
  • 743
  • 889