1

I have a MySQL 5.6 table defined thusly:

DROP TABLE IF EXISTS `ranges`;
CREATE TABLE `ranges` (
`id` int(6) unsigned NOT NULL,
`start` datetime NOT NULL,
`end` datetime NOT NULL);

INSERT INTO `ranges` (`id`, `start`, `end`) VALUES 
(1, '2017-12-16 00:00:00', '2017-12-16 04:00:00'), 
(1, '2017-12-16 12:00:00', '2017-12-16 16:00:00'), 
(1, '2017-12-16 03:00:00', '2017-12-16 13:00:00'), 
(2, '2017-12-16 00:00:00', '2017-12-16 05:00:00'), 
(2, '2017-12-16 07:00:00', '2017-12-16 14:00:00'), 
(3, '2017-12-16 00:00:00', '2017-12-16 04:00:00');

I want to merge overlapping time ranges by id and preserve non-overlapping ranges as-is. So id 1 would be collapsed into a single row, whereas ids 2 and 3 would remain as they are since id 2's rows don't overlap and id 3 obviously can't since it has only one row. The resulting output should be:

1, 2017-12-16 00:00:00, 2017-12-16 16:00:00
2, 2017-12-16 00:00:00, 2017-12-16 05:00:00 
2, 2017-12-16 07:00:00, 2017-12-16 14:00:00 
3, 2017-12-16 00:00:00, 2017-12-16 04:00:00

I've looked at related questions and fiddled with the solution here: https://www.sqlservercentral.com/Forums/Topic826031-8-1.aspx. I attempted to adapt that solution, which isn't dependent on an id column, with this attempt:

SELECT s1.id, s1.Start, 
       MIN(t1.End) AS End 
FROM ranges s1 
INNER JOIN ranges t1 ON s1.id = t1.id AND s1.Start <= t1.End
  AND NOT EXISTS(SELECT * FROM ranges t2 
             WHERE t1.id = t2.id AND t1.End >= t2.Start AND t1.End < t2.End) 
 WHERE NOT EXISTS(SELECT * FROM ranges s2 
             WHERE s1.id = s2.id AND s1.Start > s2.Start AND s1.Start <= s2.End) 
GROUP BY s1.id 
ORDER BY s1.id; 

However, simply adding the id to all the predicates doesn't give the correct result; it seems to ignore the non-overlapping ranges. Since I don't understand how the original query actually works, I'm rather out to sea as far as the correct incantation. Thank you!

SQLFiddle: http://sqlfiddle.com/#!9/6bf76b/1/0

Dan Austin
  • 79
  • 1
  • 4
  • This is really painful in MySQL. Upgrade to MySQL 8.0 and then ask again. – Gordon Linoff Dec 17 '17 at 01:27
  • @GordonLinoff Well, I'm heartened that a man of your reputation thinks so; it means I'm not a complete idiot for not seeing an obvious solution. Unfortunately I must implement this in MySQL 5.6, as upgrading the system will require testing, client review, and much corporate process that will not happen in this product iteration. If no one ends up tackling this question for 5.6, I'll mark an 8.0 solution as best answer. – Dan Austin Dec 17 '17 at 01:44
  • @DanAustin: Adding `s1.start` to the `GROUP BY` at your fiddle appears to achieve the output that you listed. See [fiddle](http://sqlfiddle.com/#!9/6bf76b/20) – Paul T. Dec 17 '17 at 04:11
  • there is a set of approaches: using variables(https://stackoverflow.com/a/8120432/2071697), recursive CTE(https://stackoverflow.com/a/19284250/2071697) and even using counter how I would do in client language(https://stackoverflow.com/a/28372148/2071697); also I've found interesting solution on different resource: https://dba.stackexchange.com/questions/100965/combining-separate-ranges-into-largest-possible-contiguous-ranges – skyboyer Dec 17 '17 at 08:43
  • and another example of using window functions: https://stackoverflow.com/a/14807943/2071697 – skyboyer Dec 17 '17 at 08:46
  • @PaulT: having Fiddled with various inputs, I do believe that does the trick! I was close, but I still need to drink a lot of coffee to understand *why* this whole thing works. In any case, if you post this as an Answer, I'd be happy to accept it so that you get credit. – Dan Austin Dec 18 '17 at 23:15
  • @DanAustin . . . Although you can get a query similar to yours to work for your particular data, it won't work if two records start at the same time and both records start a new period. – Gordon Linoff Dec 19 '17 at 03:40
  • I'll hold on the answer for I wasn't sure that it would work with all cases, for I merely "fiddled" (pun intended) with the query as-is without adding any additional data entries. As @GordonLinoff has pointed out, there are situations where it will not work as expected, even with that change. – Paul T. Dec 19 '17 at 04:04
  • @GordonLinoff: Hmm, I'm not sure what you mean by starting at the same time and starting a new period. My interpretation of that still seems to work in testing, but again since I don't understand the original query, my faith in the correctness isn't strong enough to trust production code to it. I can do this outside the database and just generate the SQL for a temporary table that contains the correct results and at least sleep at night that it works correctly. – Dan Austin Dec 20 '17 at 05:04
  • @PaulT: very well then, I'll just go with a workaround. Thanks. – Dan Austin Dec 20 '17 at 05:06
  • I am also trying to see how that would not work being grouped by both id and start date, as it seems duplicate start dates could exist, but I'm not sure, I'd have to revisit the fiddle to see if I can duplicate what @GordonLinoff mentions. – Paul T. Dec 21 '17 at 01:35

0 Answers0