6

I have table with the following column:

[name_of_pos] varchar,
[date_from] datetime,
[date_to] datetime

Below is my sample data:

name_of_pos  date_from                 date_to
----------------------------------------------------------------
Asystent     2015-08-26 08:57:49.000   2015-09-04 08:57:49.000
Biuro        2015-09-01 08:53:32.000   2015-09-01 08:53:32.000
Biuro        2015-09-02 09:00:41.000   2015-09-02 09:00:41.000
Biuro        2015-09-03 11:46:03.000   2015-09-03 11:46:03.000
Biuro        2015-09-10 09:02:11.000   2015-09-15 09:02:11.000
Koordynator  2015-09-01 09:04:06.000   2015-09-01 09:04:06.000
Projektant   2015-08-31 08:59:46.000   2015-09-01 08:59:46.000
Projektant   2015-09-02 08:00:54.000   2015-09-02 08:00:54.000
Projektant   2015-09-14 12:34:50.000   2015-09-14 12:34:50.000

What I want to return is the date range (min of date_from to max of date_to) for each name_of_pos, but only where the date values are continuous (the time part is not important and can be ignored in result).

The desired output would be:

  name_of_pos   date_from   date_to
  ------------------------------------
  Asystent      2015-08-26  2015-09-04 
  Biuro         2015-09-01  2015-09-03 
  Biuro         2015-09-10  2015-09-15 
  Koordynator   2015-09-01  2015-09-01 
  Projektant    2015-08-31  2015-09-02 
  Projektant    2015-09-14  2015-09-14 

I tried a solution using something similar to this question:

How do I group on continuous ranges

But had no luck as I have two datetime columns.

Community
  • 1
  • 1
Bart
  • 65
  • 1
  • 2
  • 4
  • Are you sure the source data you presented is correct? If so, can you explain how you got the `Biuro 2015-09-01 08:53:32.000 2015-09-03 11:46:03.000` interval? There does not seem to be a continuous time interval in the source data for this record. – Cristian Lupascu Sep 02 '15 at 07:29
  • The date part must be continous,time part is not important (can be ignored) so three records for biuro ( where date_from is 2015-09-01,2015-09-02 and 2015-09-03) should be return as one row for name_of_pos='Biuro' with values date_from='2015-09-01' and date_to='2015-09-03' – Bart Sep 02 '15 at 07:33

4 Answers4

4

Here's a solution using a cte to iterate over the rows (after they have been ordered) and check for consecutive days before grouping:

-- dummy table
CREATE TABLE #TableA
    (
      [name_of_pos] VARCHAR(11) ,
      [date_from] DATETIME ,
      [date_to] DATETIME
    );

-- insert dummy data
INSERT  INTO #TableA
        ( [name_of_pos], [date_from], [date_to] )
VALUES  ( 'Asystent', '2015-08-26 08:57:49', '2015-09-04 08:57:49' ),
        ( 'Biuro', '2015-09-01 08:53:32', '2015-09-01 08:53:32' ),
        ( 'Biuro', '2015-09-02 09:00:41', '2015-09-02 09:00:41' ),
        ( 'Biuro', '2015-09-03 11:46:03', '2015-09-03 11:46:03' ),
        ( 'Biuro', '2015-09-10 09:02:11', '2015-09-15 09:02:11' ),
        ( 'Koordynator', '2015-09-01 09:04:06', '2015-09-01 09:04:06' ),
        ( 'Projektant', '2015-08-31 08:59:46', '2015-09-01 08:59:46' ),
        ( 'Projektant', '2015-09-02 08:00:54', '2015-09-02 08:00:54' ),
        ( 'Projektant', '2015-09-14 12:34:50', '2015-09-14 12:34:50' );

-- new temp table used to add row numbers for data order
SELECT  name_of_pos, CAST(date_from AS DATE) date_from, CAST(date_to AS DATE) date_to,
        ROW_NUMBER() OVER ( ORDER BY name_of_pos, date_from ) rn
INTO    #temp
FROM    #TableA

-- GroupingColumn in cte used to identify and group consecutive dates
;WITH    cte
          AS ( SELECT   name_of_pos ,
                        date_from ,
                        date_to ,
                        1 AS GroupingColumn ,
                        rn
               FROM     #temp
               WHERE    rn = 1
               UNION ALL
               SELECT   t2.name_of_pos ,
                        t2.date_from ,
                        t2.date_to ,
                        CASE WHEN t2.date_from = DATEADD(day, 1, cte.date_to) 
                                  AND cte.name_of_pos = t2.name_of_pos
                             THEN cte.GroupingColumn
                             ELSE cte.GroupingColumn + 1
                        END AS GroupingColumn ,
                        t2.rn
               FROM     #temp t2
                        INNER JOIN cte ON t2.rn = cte.rn + 1
             )
    SELECT  name_of_pos, MIN(date_from) AS date_from, MAX(date_to) AS date_to
    FROM    cte
    GROUP BY name_of_pos, GroupingColumn

DROP TABLE #temp
DROP TABLE #TableA

Produces your desired output:

name_of_pos date_from   date_to
Asystent    2015-08-26  2015-09-04
Biuro       2015-09-01  2015-09-03
Biuro       2015-09-10  2015-09-15
Koordynator 2015-09-01  2015-09-01
Projektant  2015-08-31  2015-09-02
Projektant  2015-09-14  2015-09-14
Tanner
  • 22,205
  • 9
  • 65
  • 83
  • Works great.thank you! I have one more question -is it possible to create view with this query ? – Bart Sep 02 '15 at 08:14
  • @Bart yes, you should be able to do it [like this](http://dba.stackexchange.com/questions/7239/transact-sql-using-with-in-create-view) – Tanner Sep 02 '15 at 08:25
  • I am getting a recursion error (from the CTE continually joining on itself) because I have a bigger data set. Do you have any recommendations? – thalacker Mar 15 '22 at 16:28
1

You could use cte for that, but from my experience, the fastest way to do it is to use update in a loop:

declare @temp table
(
  name_of_pos varchar(128),
  date_from datetime,
  date_to datetime
)

insert into @temp (
    name_of_pos, date_from, date_to
) 
values
    ('Asystent', '2015-08-26 08:57:49', '2015-09-04 08:57:49'),
    ('Biuro', '2015-09-01 08:53:32', '2015-09-01 08:53:32'),
    ('Biuro', '2015-09-02 09:00:41', '2015-09-02 09:00:41'),
    ('Biuro', '2015-09-03 11:46:03', '2015-09-03 11:46:03'),
    ('Biuro', '2015-09-10 09:02:11', '2015-09-15 09:02:11'),
    ('Koordynator', '2015-09-01 09:04:06', '2015-09-01 09:04:06'),
    ('Projektant', '2015-08-31 08:59:46', '2015-09-01 08:59:46'),
    ('Projektant', '2015-09-02 08:00:54', '2015-09-02 08:00:54'),
    ('Projektant', '2015-09-14 12:34:50', '2015-09-14 12:34:50')

----------------------------------------------------------------------------------------------------
declare @temp_new table (
  name_of_pos varchar(128),
  date_from date,
  date_to date
)

insert into @temp_new (
    name_of_pos, date_from, date_to
)
select
    name_of_pos, date_from, date_to
from @temp

while @@rowcount > 0
begin
    update t1 set
        date_to = t2.date_to
    from @temp_new as t1
        inner join @temp_new as t2 on
            t2.name_of_pos = t1.name_of_pos and
            dateadd(dd, 1, t1.date_to) = t2.date_from
end

select name_of_pos, min(date_from), date_to
from @temp_new
group by name_of_pos, date_to
order by name_of_pos, date_to
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
1

This is a gaps and islands issue. This is the tuned official way to accomplish it and this would be checked as solution:

;with 

cte as (
    SELECT *,
          dateadd( day,
           - (ROW_NUMBER() OVER (
               partition by name_of_pos
               ORDER BY t.date_from
              ) +                           -- here starts tuned part --
              isnull( 
               sum( datediff(day, date_from, date_to ) ) OVER (
                partition by name_of_pos
                ORDER BY t.date_from
                ROWS BETWEEN UNBOUNDED PRECEDING and 1 PRECEDING
              ) ,0)                         -- here ends tuned part --
           ),
           date_from
          ) as Grp
    FROM t
)  

SELECT name_of_pos
    ,min(date_from) AS date_from
    ,max(date_to) AS date_to
FROM cte
GROUP BY name_of_pos, Grp
ORDER BY name_of_pos, date_from

Here tested on sqlfiddle (with some few different sample data).

dani herrera
  • 48,760
  • 8
  • 117
  • 177
  • ( notice than for your structure you must to cast datetime to date ) – dani herrera Sep 02 '15 at 11:16
  • The only downside that it couldn't be applied to overlapping periods, but in case of consecutive periods this should work quite nice. I'll check this on substantial amount of rows and 'll be back – Roman Pekar Sep 02 '15 at 13:37
  • and also you probably want to mention that this will work only in SQL Server 2012+ – Roman Pekar Sep 02 '15 at 13:46
-2

Try this once:

SELECT name_of_pos, date_from,date_to
FROM table
ORDER BY
name_of_pos asc, date_from desc;