How do I group on continuous ranges

Question

I know some basic SQL, but this one is beyond me. I have looked high and low but no dice. I need a view of the following data, I can do this in the application layer code. But unfortunately for this particular one, the code must be put in the data layer.

I am using T-SQL.

Table

Date      Crew       DayType
01-02-11  John Doe  SEA  
02-02-11  John Doe  SEA  
03-02-11  John Doe  SEA  
04-02-11  John Doe  HOME  
05-02-11  John Doe  HOME  
06-02-11  John Doe  SEA

I need a view like this

DateFrom  DateTo    Name      DayType
01-02-11  03-02-11  John Doe  SEA
04-02-11  05-02-11  John Doe  HOME
06-02-11  06-02-11  John Doe  SEA

Unfortunately, the base table is required for the application layer to be in the format shown. Is this possible to do in a query?

score 36 · Accepted Answer · answered Apr 14 '11 at 11:36

36

WITH    q AS
        (
        SELECT  *,
                ROW_NUMBER() OVER (PARTITION BY crew, dayType ORDER BY [date]) AS rnd,
                ROW_NUMBER() OVER (PARTITION BY crew ORDER BY [date]) AS rn
        FROM    mytable
        )
SELECT  MIN([date]), MAX([date]), crew AS name, dayType
FROM    q
GROUP BY
        crew, dayType, rnd - rn

This article may be of interest to you:

Things SQL needs: SERIES()

answered Apr 14 '11 at 11:36

Quassnoi

413,100
91
616
614

+1, good as always. Did you examine if the ROW_NUMBER approach is faster compared to recursive one on larger sets? – Unreason Apr 14 '11 at 11:46
+1 out of interest, if there were multiple Daytypes for a crew name for a given date (i.e. add 02-02-11,John Doe,Home into sample DDL), would it be best to use DENSE_RANK for rn or is there a better way instead? – Dibstar Apr 14 '11 at 14:23
But (a) the data already exists in Relational form, (b) the view required is Relational (c) SQL handles Relational data. Why convert it to a file (complete with a Record ID), and then process it as a file, at massive resource cost ? The method for solving this Relationally, without 1970-style Record IDs or CTEs, is given in this [**Answer**](http://stackoverflow.com/a/30460263/484814) – PerformanceDBA May 28 '15 at 13:23
The Derived Table with ROW_NUMBER(), that drives the WITH. Check your SHOWPLAN. – PerformanceDBA May 28 '15 at 13:33
@PerformanceDBA: sorry, I don't get it. What does the derived table have to do with files? What am I supposed to find in the plan? – Quassnoi May 28 '15 at 13:35
(0) The data is Relational, it does not have a Record ID, it has a good PK (1) When you create a Derived Table, with a Record ID, you are converting the Relational data, to the format of a pre-1970 ISAM file, which is non-relational. (2) Then you process the file (Derived Table) using processing methods, which employs a pre-relational Record Filing System mindset. (3) Both steps have a cost, which will be visible in the plan & stats. (4) Whereas, you could simply SELECT from the Relational data, using Relational methods, which would avoid those costs. But that requires a Relational mindset. – PerformanceDBA May 28 '15 at 13:46
@PerformanceDBA: sorry, it seems you are unaware of SQL Server's architecture. SQL Server does not use any ISAM files. – Quassnoi May 28 '15 at 14:17
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/79021/discussion-between-performancedba-and-quassnoi). – PerformanceDBA May 28 '15 at 14:30
Nowhere in my comments did I suggest that SQL Server uses ISAM Files. You will be aware of course that you can create an ISAM File structure using **in** SQL Server, via CREATE TABLE or by using a Derived Table (as you have). Everywhere, I stated that **your solution** *converts* the Relational data to a ISAM File format (inside SQL Server), and then processes it using File processing methods. Instead of Relational methods over the source data, without the ISAM File steps in-between. – PerformanceDBA May 28 '15 at 15:17

score 15 · Answer 2 · answered Apr 14 '11 at 13:46

15

WITH grouped AS (
  SELECT
    *,
    grp = DATEDIFF(day, 0, Date) -
          ROW_NUMBER() OVER (PARTITION BY Crew, DayType ORDER BY Date)
  FROM @testtable
)
SELECT
  DateFrom = MIN(Date),
  DateTo = MAX(Date),
  Name = Crew,
  DayType
FROM grouped
GROUP BY Crew, DayType, grp;

Basically, same as Quassnoi's solution, but using one ROW_NUMBER fewer yields a better execution plan.

answered Apr 14 '11 at 13:46

Andriy M

76,112
17
94
154

1

Use of `DATEDIFF(day, 0, Date)` ensures that dates are continuous when grouping. IMO this is a better answer – ughai Jun 17 '15 at 08:49
Be careful because this version requires that the dates be continuous within a group - no missing days. If there's a weekend/holiday with no rows, DateDiff moves more than row_number and you get a new group, right? – cbare Nov 02 '22 at 21:53

score 0 · Answer 3 · answered Apr 14 '11 at 11:35

0

SELECT MIN(Date) AS DateFrom,MAX(Date) AS DateTo, Crew, DayType FROM yourTableName GROUP BY Crew, DayType

answered Apr 14 '11 at 11:35

asharajay

1,184
9
19

3

Doesn't this answer just gives the min & max dates for each Crew/DayType combo but not split into continuous date ranges. – Snorex Jun 19 '15 at 16:39

score -1 · Answer 4 · edited Apr 14 '11 at 11:35

-1

Something Like:

SELECT Crew, DayType, MIN(Date) AS SomeDate1, MAX(Date) AS SomeDate2 
FROM Mytable 
GROUP BY Crew, DayType

edited Apr 14 '11 at 11:35

tster

17,883
5
53
72

answered Apr 14 '11 at 11:34

Lasse Edsvik

9,070
16
73
109

score -2 · Answer 5 · answered Apr 14 '11 at 11:40

-2

Select Min(Date) DateFrom, Max(Date) DateTo, Crew Name,DayType From Mytable Group By Crew,DayType

try this.

answered Apr 14 '11 at 11:40

Sai Kalyan Kumar Akshinthala

11,704
8
43
67

This returns only two lines. `01-02-11 06-02-11 John Doe SEA` and `04-02-11 05-02-11 John Doe HOME`. – Sergio Marcelo C Figueiredo Sep 25 '17 at 21:57

How do I group on continuous ranges

5 Answers5

Linked