I have a set of meetings and a participation table. For meetings I have a start and an end time, and for participation table, each row corresponds to a participant joining a meeting. The participation table includes only instances that have actually happened. I would like to find K other meetings that the participant could have joined, with the constraint that the meetings were available at the time (i.e., K other meetings that were going on at the same time).
Here is an example:
Given the two matrices below, Meeting availability:
| Meeting | Start time | End time |
| ------- | -------------------- | --------------------- |
| M1 | Nov 01, 2021 8:00 AM | Nov 01, 2021 9:00 AM |
| M2 | Nov 01, 2021 8:00 AM | Nov 01, 2021 8:45 AM |
| M3 | Nov 01, 2021 8:15 AM | Nov 01, 2021 8:45 AM |
| M4 | Nov 01, 2021 8:15 AM | Nov 01, 2021 9:00 AM |
| M5 | Nov 01, 2021 9:00 AM | Nov 01, 2021 10:00 AM |
| M6 | Nov 01, 2021 9:00 AM | Nov 01, 2021 9:45 AM |
| M7 | Nov 01, 2021 9:15 AM | Nov 01, 2021 9:45 AM |
| M8 | Nov 01, 2021 9:15 AM | Nov 01, 2021 10:00 AM |
and participation:
| User | Meeting | Joined time |
|------|---------|----------------------|
| U1 | M1 | Nov 01, 2021 8:01 AM |
| U2 | M3 | Nov 01, 2021 8:16 AM |
| U3 | M7 | Nov 01, 2021 9:16 AM |
I'd like the output to look like this (assuming K=2):
| User | Meeting | Joined time | K other meetings possible |
|------|---------|----------------------|---------------------------|
| U1 | M1 | Nov 01, 2021 8:01 AM | [M2] |
| U2 | M3 | Nov 01, 2021 8:16 AM | [M4, M2] |
| U3 | M7 | Nov 01, 2021 9:16 AM | [M8, M5] |
Basically, I would like to calculate a new column which includes K random meetings that were going on at the same time (with minute granularity) at the time participant joined a meeting.
One idea is to create a per-minute meeting availability table (similar to this solution), join it with participation table on datetime, and then use a udf to pick K random meetings. But that might result in out of memory issues. Any other idea is appreciated. Assume that per-minute availability table exists. Is there any option other than joining and filtering?