Optimizing Optapy constraints and possibly redefining classes

Question

I started using OptaPy last week, I've read through the documentation and watched a few videos. I have a rough understanding of how things work in the big picture - but I would appreciate any input and tips for optimization.

I am trying to generate a schedule similar to "employee-scheduling example" on github a few different conditions. So I used this project as a baseline.

The problem is the following:

I have two locations, each with their own min and max shifts.
Location 1 - min employees per shift = 3, max employees per shift = 4
Location 2 - min employees per shift = 2, max employees per shift = 4

Each location has their own timetable of shifts.
Location 1 - Monday to Friday (08:00 to 16:00) - 5 x 2 hour shifts a day.
Location 2 - Monday to Friday (08:00 to 18:00) - 6 x 2 hour shifts a day. Saturday (08:00 to 12:00) - 2 x 2 hour shifts.

Now here is where things get interesing.

So obviously each shift should have the minimum required employees assigned to it, but it would be nice to fill the optional employees too.

Initially I thought of defining the Shift Class with a list of employees. But at the time I could not find an example of this, I only found an example of 1 employee per shift. So to accomodate this, I generate from the database for each timeslot, 4 shifts and set a employee_required boolean.
Eg. Location 1 - Monday 08:00 to 10:00 - (since location has max employees of 4 and min employees of 3) I create 4 shifts for this timeslot, 3 of them have required set to true and 1 of them required is set to false.

Currently, I am not using this information (since I don't know how to focus OptaPy on the required shifts first and the optional shifts later to get the best result/score).

As of now I have only 2 constraints.

A employee must be available to do that shift. How I calculate this, the employee has an array of timeslot-hashes that he/she can do.

 employee.timeslots_unique_hashes = ["Monday-08:00-10:00", "Thursday-12:00-14:00"]

And then I have a unique hash for the shift - "Monday-08:00-10:00". So if the shift hash is not in the employees list of hashes. I penalise based on the shift length. Below is my constraint

def required_timeslot(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .filter(lambda shift: shift.unique_hash not in shift.employee.timeslots_unique_hashes) \
        .penalize("Employee missing timeslot", HardSoftScore.ONE_HARD, lambda shift: get_shift_duration_in_minutes(shift))

A employee can only have 1 shift per day across all locations.

def one_shift_per_day(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each_unique_pair(Shift,
              Joiners.equal(lambda shift: shift.employee),
              Joiners.equal(lambda shift: shift.day_name)
        ) \
        .penalize("Max one shift per day", HardSoftScore.ONE_HARD)

So where I need advice or help is in the following:

(Optional) Implement the required boolean on a shift - focus on required shifts to be filled with less priority on the optional to improve score.
(Required) Constraint - A employee has a array of other employee ids that are required. So there might be a husband and wife that are required to be together. Or mother and daughter. I tried writing the constraint but need help completing it.

The logic here would be to group the shifts by their unique hash, cycle through the employees and build an array of required employees for that shift. If the array employees assigned to that shift does not match the array of required employees, penalise based on the amount of employees missing.

def required_employees(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .groupBy(lambda shift: shift.unique_hash, to_list(lambda shift: shift.employee)) \
        .filter(lambda shift, employees: shift_does_not_contains_all_required_employees(shift)) \
        .penalize("Shift does not contain all the required employees", HardSoftScore.ONE_HARD, lambda shift: get_missing_required_employees_count(shift))

(Required but not sure possible) Constraint - ensure that all the employees get atleast one shift. Obviously some employees have the possibility to get 1 shift per day, but I want to prioritise that everyone get atleast one shift before giving someone that already has a shift another one.

Any help/advice/suggestions would be highly appreciated. Thank you

score 0 · Answer 1 · answered Oct 19 '22 at 03:37

Your constraint for "Employee missing timeslot" look correct. It does differs from how the optapy quickstart for employee scheduling (https://github.com/optapy/optapy-quickstarts/tree/stable/employee-scheduling) handles it. In the quickstart, we use availability objects instead, and join shifts with that employee with the availability object, either penalizing by hard if it unavailable, penalizing by soft if it unwanted, or rewarding by soft if it wanted:

def unavailable_employee(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .join(Availability,
              Joiners.equal(lambda shift: shift.employee,
                            lambda availability: availability.employee),
              Joiners.equal(lambda shift: shift.start.date(),
                            lambda availability: availability.date)
              ) \
        .filter(lambda shift, availability: availability.availability_type == AvailabilityType.UNAVAILABLE) \
        .penalize('Unavailable employee', HardSoftScore.ONE_HARD,
                  lambda shift, availability: get_shift_duration_in_minutes(shift))


def desired_day_for_employee(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .join(Availability,
              Joiners.equal(lambda shift: shift.employee,
                            lambda availability: availability.employee),
              Joiners.equal(lambda shift: shift.start.date(),
                            lambda availability: availability.date)
              ) \
        .filter(lambda shift, availability: availability.availability_type == AvailabilityType.DESIRED) \
        .reward('Desired day for employee', HardSoftScore.ONE_SOFT,
                lambda shift, availability: get_shift_duration_in_minutes(shift))


def undesired_day_for_employee(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .join(Availability,
              Joiners.equal(lambda shift: shift.employee,
                            lambda availability: availability.employee),
              Joiners.equal(lambda shift: shift.start.date(),
                            lambda availability: availability.date)
              ) \
        .filter(lambda shift, availability: availability.availability_type == AvailabilityType.UNDESIRED) \
        .penalize('Undesired day for employee', HardSoftScore.ONE_SOFT,
                  lambda shift, availability: get_shift_duration_in_minutes(shift))

This model decouples availability from employee, allowing them to be more easily stored in a database. There is nothing wrong with your current approach though if it more convenient for your use case.

Your constraint for "A employee can only have 1 shift per day across all locations." is correct and is also how the employee scheduling quickstart does it.
"Implement the required boolean on a shift - focus on required shifts to be filled with less priority on the optional to improve score." This can be done by making the employee planning variable nullable and changing the score type to HardMediumSoftScore. Additionally, the Shift class will have a bool field that is True if it is required, False otherwise. Your Shift class would look like this:

@optapy.planning_entity
class Shift:
    employee: Employee | None
    is_required: bool
    
    # ...
    # other fields, __init__, etc.
    # ...
    
    @optapy.planning_variable(Employee, value_range_provider_refs=['employee_range'], nullable=True)
    def get_employee(self):
        return self.employee

    def set_employee(self, employee):
        self.employee = employee

And you have two constraints for Shift without employees:

def required_shift_missing_employee(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each_including_null_vars(Shift) \
        .filter(lambda shift: shift.is_required and shift.employee is None) \
        .penalize('Required shift missing employee', HardMediumSoftScore.ONE_HARD)

def required_shift_missing_employee(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each_including_null_vars(Shift) \
        .filter(lambda shift: (not shift.is_required) and shift.employee is None) \
        .penalize('Optional shift missing employee', HardMediumSoftScore.ONE_MEDIUM)

These constraints make required shifts missing employees penalize by a hard score, and optional shifts missing employees penalize by a medium score. Since 1 hard > any medium, optapy will focus on filling required shifts first, then as many optional shifts it can do.

Your constraint for "Shift does not contain all the required employees" look correct. I would probably do something like this:

def required_employees(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Shift) \
        .join(Shift,
              Joiners.equal(lambda shift: shift.unique_hash),
              Joiners.filtering(lambda shift_1, shift_2: shift_2.employee in shift_1.required_coworker_list) \
        .group_by(lambda shift_1, shift_2: shift_1,
                  ConstraintCollectors.count_distinct(lambda shift_1, shift_2: shift_2.employee)) \
        .filter(lambda shift, number_of_required_coworkers_sharing_shift: number_of_required_coworkers_sharing_shift < len(shift.employee.required_coworker_list)) \
        .penalize("Shift does not contain all the required employees",
                  HardSoftScore.ONE_HARD,
                  lambda shift, number_of_required_coworkers_sharing_shift: len(shift.employee.required_coworker_list) - number_of_required_coworkers_sharing_shift)

What this constraint does is:

Get all shifts (shift_1)
Join each shift (shift_1) with shifts (shift_2) where shift_1 and shift_2 share the same timeslot and shift_2.employee is in shift_1.required_coworker_list
Count the number of distinct shift_2.employee for each group of shift_1
Filter to only include matches where the number of distinct employees is less than len(shift_1.employee.required_coworker_list). Since the employees for shift_2 can only be members of shift_1.employee.required_coworker_list, this condition is matched if and only if not all of the employee's required coworker share the shift with them.
Penalize by len(shift_1.employee.required_coworker_list) - number_of_coworkers_sharing_shift, which is the number of missing employees.

"Ensure that all the employees get at least one shift." This constraint is tied strongly to fairness. I recommend reading this blog post about fairness in OptaPlanner (https://www.optaplanner.org/blog/2017/02/03/FormulaForMeasuringUnfairness.html). "Ensure that all the employees get at least one shift." is possible, but it will probably not do what you want. This constraint (all employee has at least one shift) is below:

def all_employees_have_at_least_one_shift(constraint_factory: ConstraintFactory):
    return constraint_factory \
        .for_each(Employee) \
        .if_not_exists(Shift, Joiners.equal(lambda employee: employee, lambda shift: shift.employee)) \
        .penalize("Employee has no shift", HardSoftScore.ONE_HARD)

For how to actually do fairness, see this stack overflow answer: https://stackoverflow.com/a/74018711/9698517 ; (Note: there is currently a bug in OptaPlanner (https://issues.redhat.com/browse/PLANNER-2834) that might cause an exception to be thrown using that code).

A note on changing the score type: you need to do it everywhere; you cannot mix both HardMediumSoftScore and HardSoftScore. So you need to modify the penalize/reward for all of your constraints to use HardMediumSoftScore, and also change the type passed to @planning_score in the @planning_solution class.

Note on PLANNER-2834: although this is a completely valid CS use case and we should technically support it (and we don't), I'd argue it is needlessly complex and inefficient. All those groupBy and join calls can be replaced by a single groupBy with a custom constraint collector implementation. That would be the preferred approach here, both in terms of complexity and runtime performance. — Lukáš Petrovický, Oct 19 '22 at 09:47
Thank you very much Christopher. I really appreciate your time in helping. I will have a look at implementing this as soon as possible, and get back to you whether I was successful or not. — Jeandrew Swart, Oct 20 '22 at 04:43

Optimizing Optapy constraints and possibly redefining classes

1 Answers1