We have a table of events (as in calendar events with start and end times) that is queried regularly:
CREATE TABLE event (
`id` varchar(32) NOT NULL,
`start` datetime,
`end` datetime,
`derivedfrom_id` varchar(32),
`parent_id` varchar(32) NOT NULL
)
- The `parent_id` column points to a calendar table that provides some additional information.
- Some of the events were created out of another event and hence have a reference pointing to that "origin" event via the `derivedfrom_id` column.
When retrieving a set of events, we usually query by date (`start`/`end`) and calendar (`parent_id`) and limit the number of results via `LIMIT` for paging.
The problem we are now facing: sometimes we need to merge related events into a single representation for the user. So we run our normal query:
SELECT id, start, parent_id
FROM event
WHERE parent_id in (<list of calendars>)
AND start >= 'some date'
LIMIT x
... and then filter out the original events, because the derived events carry different information and refer to their origins anyway.
As you might have seen (sooner than we did), the limit is applied before the filtering, so we end up with fewer events than we anticipated, i.e. the number of results is lower than 'x' after the filtering.
The only thing I could think of is to duplicate the query and do a sub-select:
SELECT id, start, parent_id
FROM event
WHERE parent_id in (<list_of_calendars>)
AND start >= 'some date'
AND (/* the part below duplicates the previous conditions */
derivedfrom_id is not null
or id not in (
SELECT derivedfrom_id
FROM event
WHERE parent_id in (<list_of_calendars>)
AND start >= 'some date'
AND derivedfrom_id is not null
)
)
LIMIT x
But I can hardly believe that this is the only way to do this, especially since our real query is much more complicated.
Is there a better way?
Example Data
(as requested in a comment)
Given these three events:
│ *ID* │ *DERIVEDFROM_ID* │ *PARENT_ID* │ *START*
├──────┼──────────────────┼─────────────┼─────────────────
│ 100 │ - │ A │ 2014-11-18 15:00
│ 101 │ 100 │ B │ 2014-11-18 15:00
│ 150 │ - │ A │ 2014-11-20 08:00
... and a limit of 2, I want to get events 101 and 150.
Instead, with the current approach:
- The query with a limit of 2 results in events 100 and 101
- After filtering, event 100 is discarded and the only remaining event is 101
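To make the difference concrete, here is a minimal in-memory sketch in Java. It is not our actual JPA code: the `Event` class, the `LimitBeforeFilterDemo` class and its method names are made up for illustration, and the list order simply stands in for whatever order the database returns the rows in. It applies the same "drop the origins" rule to the three sample events, once after the limit (what we do now) and once before it (what we want).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class LimitBeforeFilterDemo {
    // Hypothetical stand-in for one row of the event table.
    static final class Event {
        final String id;
        final String derivedFromId; // null for events that are not derived
        final String parentId;
        final String start;

        Event(String id, String derivedFromId, String parentId, String start) {
            this.id = id;
            this.derivedFromId = derivedFromId;
            this.parentId = parentId;
            this.start = start;
        }
    }

    // The three sample events from the table above.
    static final List<Event> SAMPLE = Arrays.asList(
        new Event("100", null, "A", "2014-11-18 15:00"),
        new Event("101", "100", "B", "2014-11-18 15:00"),
        new Event("150", null, "A", "2014-11-20 08:00"));

    // Keeps an event if it is itself derived, or if no event in the
    // given list is derived from it (mirrors the sub-select condition).
    static List<Event> dropOrigins(List<Event> events) {
        Set<String> originIds = events.stream()
            .map(e -> e.derivedFromId)
            .filter(Objects::nonNull)
            .collect(Collectors.toSet());
        return events.stream()
            .filter(e -> e.derivedFromId != null || !originIds.contains(e.id))
            .collect(Collectors.toList());
    }

    // Current approach: the limit is applied first, the filter afterwards.
    static List<String> limitThenFilter(List<Event> rows, int limit) {
        List<Event> page = rows.stream().limit(limit).collect(Collectors.toList());
        return dropOrigins(page).stream().map(e -> e.id).collect(Collectors.toList());
    }

    // Desired behaviour: filter first, then apply the limit.
    static List<String> filterThenLimit(List<Event> rows, int limit) {
        return dropOrigins(rows).stream()
            .limit(limit)
            .map(e -> e.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(limitThenFilter(SAMPLE, 2)); // [101]
        System.out.println(filterThenLimit(SAMPLE, 2)); // [101, 150]
    }
}
```

The second variant is what the sub-select query achieves inside the database; doing it in application code like this would of course mean loading all matching rows first, which is exactly what paging is supposed to avoid.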
Note on Expected Answer
The SQL above is actually generated by a Java application that uses JPA. My current solution is to generate the where clause once and duplicate it in the sub-select. If there is a generic or a JPA-specific way to do this, I would appreciate any pointers.