With a suitable source of dates (as given by answers to "Get a list of dates between two dates" or generate days from date range), the dates can be filtered based on the query date range (e.g. '2022-04-04' and '2022-04-06') and joined to the rows in the results. Each date can be joined to each row; to limit these to dates in the interval (start
, end
) for each result row from presences
, simply join between the start & end timestamps:
SET @after := '2022-04-04',
@before := '2022-04-06';
SELECT *
FROM `presences` AS p
JOIN (SELECT * FROM `days` WHERE `day` BETWEEN @after AND @before)
AS d ON d.`day` BETWEEN Date(p.`start`) AND Date(p.`end`) -- The core of the solution
WHERE Date(start) <= @before AND Date(end) >= @after
ORDER BY id, d.`day`
;
The above uses a sub-select to filter the days. This can also be done by filtering in the join, or the WHERE
clause (as can also be seen in a fiddle):
SELECT p.id, p.employee, p.location, d.day
FROM `presences` AS p
JOIN `days` AS d ON d.`day` BETWEEN Date(p.`start`) AND Date(p.`end`)
WHERE Date(`start`) <= @before AND Date(`end`) >= @after
AND d.day BETWEEN @after AND @before
ORDER BY id, d.`day`
;
SELECT p.id, p.employee, p.location, d.day
FROM `presences` AS p
JOIN `days` AS d
ON d.`day` BETWEEN @after AND @before
AND d.`day` BETWEEN Date(p.`start`) AND Date(p.`end`)
WHERE Date(`start`) <= @before AND Date(`end`) >= @after
ORDER BY id, d.`day`
;
Which to use is a matter of readability and potentially efficiency. For the example queries above, an EXPLAIN
of each shows the same execution plan, so there shouldn't be any efficiency differences:
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------------------------------------------+
| 1 | SIMPLE | p | NULL | ALL | NULL | NULL | NULL | NULL | 4 | 100 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | d | NULL | range | day | day | 4 | NULL | 4 | 100 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------------------------------------------------+
A Note on Column Types
If the start
and end
columns will actually only ever hold dates (i.e. the time portion will always be 00:00:00
), they can be redefined as the DATE
type, and the calls to the Date()
function can be removed from the query, simplifying it. You could get away with not redefining the columns and merely removing the calls to Date()
, as the various date-time columns are comparable as-is, though this may cause problems should any of the TIMESTAMP
columns have a time other than 00:00:00
.
The primary advantage of removing functions is to allow indices (here, on start
and end
) to be used. The example schema didn't index start
and end
, so there wouldn't be any difference in execution time or plans. If, however, you add an index:
CREATE INDEX presence_interval ON presences (`start`, `end`);
then the execution plans for the queries with the calls to Date()
removed are all:
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------------+
| 1 | SIMPLE | p | NULL | range | presence_interval | presence_interval | 5 | NULL | 2 | 33.33 | Using index condition; Using temporary; Using filesort |
| 1 | SIMPLE | d | NULL | range | day | day | 4 | NULL | 4 | 100 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+------+------+----------+-----------------------------------------------------------------+
Note there are fewer rows from the presences
column, and the filtered
percentage is cut by 2/3.