11

The Problem

Here's the essence of the problem I want to solve. We have workers taking care of children in a nursery for set times during the weekend. There's 16 different slots to fill in one weekend. So for a 4-week month there's 64 slots to fill. We have at max 30 nursery workers (though we need much more. anybody like kids?).

EDIT: Each time slot is discrete - they don't overlap.

Currently there's a person who comes up with the nursery schedule each month. It's a complex and time consuming task to make this schedule every month with everybody's preferences. After considering the problem I thought to myself, "There must be a better way!"

Algorithms

I notice that the problem is essentially a bipartite graph. The marriage problem is also a bipartite graph which you can solve by using a matching algorithm like Edmonds's matching algorithm.

But this assumes that each node in one node set matches just one node in the other node set. In my case, each nursery worker would work only one time slot. As we're seriously understaffed, that won't work! A bunch of people are going to have to work twice a month to fill up all the time slots.

Which seems to mean that this is more like the classic "hospitals/residents problem". It differs from the marriage problem in that the "women" can accept "proposals" from more than one "man" (e.g., a hospital can take multiple residents). In my case, a nursery worker can take more than one time slot.

What now?

Whew!

Now that I have the set up out of the way....does any one know of any good links that explain or show such an algorithm? Are there better ways to go about solving this? Am I over thinking it? I did a google search for "hospital residents algorithms" and found grad student papers. Gah! I graduated with a CS degree and took an AI class...but that was 6 years ago. Help!

Aaaaany advice is appreciated!!

JonH
  • 821
  • 11
  • 19
  • 4
    sounds like the http://en.wikipedia.org/wiki/Assignment_problem to me and even **O** (*x^n*) brute force solutions are feasible for sufficiently small *n* (e.g. 64) – msw Sep 23 '10 at 05:59
  • Do the slots of a single weekend overlap in time? – Svante Sep 23 '10 at 09:34
  • As I understand it is a sxn 64x30 matrix with 1/0 values - 1 means that the worker can fill the time slot. The goal is to minimize the maximum of |n| vectors while each |s|>0 – HamoriZ Sep 23 '10 at 12:46
  • @msw - would brute force mean iterating through all possible assignments, finding the matches, and picking the one that has the most assignments? @Svante - edited the description, no they don't ovelap. – JonH Sep 23 '10 at 19:34

2 Answers2

4

The "hospitals/residents problem" could indeed work but it depends of your constraints :

  • Hospital have a maximum capacity and will order the resident (most wanted to less wanted).
  • Residents will order hospitals.
  • No other constraints possible.

In your case the hospitals are workers and the residents are slots.

  • Slots can order workers (maybe you prefer experimented ones in the morning...).
  • Workers can order slots.
  • But you can't have other constraints such as : "I've worked in the morning, I don't want to work the same day in the evening".

If that's ok for you then you have to possibilities :

  • you want to advantage workers : "hospital oriented case".

    You will try to assign workers to their preferred slot(s).

  • you want to advantage slots : "resident oriented case"

    Each slot will have their preferred workers.

I had to code it last year, here is the code.

/* 
RO : needed for Resident-Oriented version
HO : needed for Hospital-Oriented version
*/
const int MAX_R = 1000;
const int MAX_H = 1000;
const int INF = 1000*1000*1000;

You need to fill the input variables. Everything is straightforward :

  • R_pref and H_pref are the list of preferences for residents/hospitals
  • H_rank[h][r] is the rank of r in H_pref[h] : the position of r in the preference list of h

That's all.

// Input data
int R, H;                   // Number of Residents/Hospitals
int C[MAX_H];               // Capacity of hospitals
vector<int> R_pref[MAX_R], H_pref[MAX_H]; // Preferences : adjency lists
/*RO*/int H_rank[MAX_H][MAX_R];   // Rank : rank of r in H_pref[h]
/*HO*/int R_rank[MAX_R][MAX_H];   // Rank : rank of h in R_pref[r]

No need to touch below.

// Internal data
int RankWorst[MAX_H];   // Rank of the worst r taken by h
/*RO*/int BestH[MAX_R];       // Indice of the best h in R_pref the r can get
/*HO*/int BestR[MAX_H];       // Indice of the best r in H_pref the h can get
int Size[MAX_H];        // Number of residents taken by h

// Output data
int M[MAX_R];

void stable_hospitals_RO()
{
    for(int h = 0 ; h < H ; h++)
      RankWorst[h] = H_pref[h].size()-1;
    fill_n(BestH, R, 0);
    fill_n(Size, H,0);
    fill_n(M,R,INF);
    for (int i = 0; i < R; i++)
        for (int r = i; r >= 0;)
        {
        if(BestH[r] == int(R_pref[r].size()))
            break;
            const int h = R_pref[r][BestH[r]++];
            if(Size[h]++ < C[h])
            {
                M[r] = h;
                break;
            }
            int WorstR = H_pref[h][RankWorst[h]];
            while(WorstR == INF || M[WorstR] != h) // Compute the worst
                WorstR = H_pref[h][--RankWorst[h]];
            if(H_rank[h][r] < RankWorst[h])        // Ranked better that worst
            {
                M[r] = h;
                M[r = WorstR] = INF;    // We have eliminate it, he need to put it somewhere
            }
        }
}
void stable_hospitals_HO()
{
    fill_n(BestR, H, 0);
    fill_n(Size, H,0);
    fill_n(M,R,INF);
    vector<int> SH;
    for (int h = 0; h < H; h++)
        SH.push_back(h);
    while(!SH.empty())
    {
        int h = SH.back();
        if(Size[h] == C[h] || BestR[h] == int(H_pref[h].size())) // Full or no r available
        {
            SH.pop_back();
            break;
        }
    const int r = H_pref[h][BestR[h]++];
    // r is unassigned or prefer h to current hospital
        if(M[r] == INF || R_rank[r][h] < R_rank[r][M[r]]) 
        {
            if(++Size[h] == C[h]) // Will be full
                SH.pop_back();
            if(M[r] != INF) // Delete from M[r]
            {
                Size[M[r]]--;
                SH.push_back(M[r]);
            }
            M[r] = h;
        }
    }
}

Example of use to show how to build rank from prefs. (In that case the preference lists were on the stdin).

int main()
{
    scanf("%d%d",&R,&H);
    int num;
    // put inf

    for(int r = 0 ; r < R ; r++)
    {
        scanf("%d",&num);
        R_pref[r].resize(num);
        for(int h = 0 ; h < num ; h++)
        {
            scanf("%d",&R_pref[r][h]);
            R_rank[r][R_pref[r][h]] = h;
        }
    }
    for(int h = 0 ; h < H ; h++)
    {
        scanf("%d",&C[h]);
        scanf("%d",&num);
        H_pref[h].resize(num);
        for(int r = 0 ; r < num ; r++)
        {
            scanf("%d",&H_pref[h][r]);
            H_rank[h][H_pref[h][r]] = r;
        }
    } 
    stable_hospitals_RO();
    printf("\n\n\n\n");
    stable_hospitals_HO();
    return 0;
}

On an example : Hospitals 1 to 3, 6 résidents.

H_pref :

  • 1 -> 2 5 6 1 (prefers 2 then 5 then 6 then 1)
  • 2 -> 4 2 1 6 3 5
  • 3 -> 1 2

R_pref :

  • 1 -> 1 2 3
  • 2 -> 3 1
  • 3 -> 2 1
  • 4 -> 1 3 2
  • 5 -> 3 2 1
  • 6 -> 3

H_rank :

  • 1 -> 4 1 INF INF 2 3 (1 is in position 4 in H_pref[1], 3 is not theree)
  • 2 -> 3 2 5 1 6 4
  • 3 -> 1 2 INF INF INF INF

Similar for R_rank.

Hospital don't have to rank everyone et can also rank less people than their capacity.

Loïc Février
  • 7,540
  • 8
  • 39
  • 51
  • Nice! I think my constraints will fit those rules. I'd go with the "hospital oriented case". A few questions on the code. 1) Where's the 1000 come from? To have an upper bound on the size of the arrays? 2) How are you using the R_pref and H_pref vectors as multi-dimensional? Looks like you're creating them as vectors with one index, but later you're calling them like R_pref[x][y]. 3) How do I pull the matching data out? That is, the hospitals and their assigned residents? Forgive me any of these questions are obvious - I've been doing Java/C# for a while and my C++ is a bit rusty! – JonH Sep 23 '10 at 19:58
  • 1) 1000 was a upper bound for the problem I was working on (no dynamic allocation : faster and safer when you have to re-code it in under 20 minutes in a programming ontest), just use whatever you want (but be sure it's big enough). 2) Each R_pref[i] is a vector 3) In "int M" (Output data) you have the matching hospital for each resident If you have problems using it post a link to your data (and explains how it is organized) and I will build the program. – Loïc Février Sep 23 '10 at 20:16
  • Gotcha! I'm creating an app to test as we speak, but one example input would help a ton. Let's say Judy, who is index h=1 in the hospital list, can work Saturday at 7PM, which is r=4 in the resident list. Out of 3 possible time slots she's available to work, this is her most preferred (the other two less preferred could be...say r=1 and r=6, in that order). How would I set H_pref and R_rank to reflect that? – JonH Sep 24 '10 at 05:58
  • I've added example and the main is modified. It was the one from an older version, before I use vector. Now I need to resize them before inserting in them. – Loïc Février Sep 24 '10 at 11:09
  • Pretty sure this is the answer I'm looking for - +1 for an in-depth explanation! I understand how to format the data now. My last question is: for the time slots, or residents, it dosn't really matter who works where. So I don't have any true preferences when it comes to residents. I assume the algorithm won't work correctly for 0 preferences for the residents? – JonH Sep 24 '10 at 22:50
  • Indeed if a resident (time slot) has not ranked any hospitals (workers), he will never be assigned to an hospital. You could either use random ordering or always the same. For example, ranked first people that don't mind working twice (they will more likely be choose twice). Don't forget to add a max capacity to each worker : 2 or 3 should be good in you case. – Loïc Février Sep 24 '10 at 22:59
1

In case anyone comes across a similar problem, here's the solution I went with:

I ended up using a Backtracking Search Algorithm for Constraint Satisfaction Problems. All it does it do a normal backtracking search but checks that all constraints are satisfied as it traverses the tree. Here is the pseudo-code:

function BACKTRACKING-SEARCH(csp) returns a solution, or failure
     return BACKTRACK({ }, csp)

function BACKTRACK(assignment, csp) returns a solution, or failure
     if assignment is complete then return assignment
     var = SELECT-UNASSIGNED-VARIABLE(csp)
     for each value in ORDER-DOMAIN-VALUES(var, assignment, csp) do
        if value is consistent with assignment then
           add {var = value} to assignment
           result = BACKTRACK(assignment, csp)
           if result != failure then
              return result
           remove {var = value} 
     return failure

The variable is a time slot. The possible values assigned to a variable are the workers. The combination of a variable and its actual assignment is a node in the tree. Thus the search space is all the possible combinations of time slots and workers. The constraints prune nodes from the search space.

A constraint can be a workers availability. So if time slot A is assigned Worker X, but X is not able to work in time slot A, then this node will be ruled inconsistent.

I solved the problem of a particular time slot being assigned multiple workers by considering each time slot / worker combination to be it's OWN time slot. So if the childrens' nursery has 2 workers to fill, I consider it as two individual time slots to fill, each getting assigned its own worker. That way each variable is assigned just one value. It makes for MUCH simpler algorithms.

Thanks for all the help whittling this down to a solvable problem.

JonH
  • 821
  • 11
  • 19