SQL split range of dates in Redshift

Question

I have a table with date ranges like

----------------------------------------------------------------
|     id    | date_start               | date_end              |
----------------------------------------------------------------
|     1     | 2017-02-03 08:00:00.000 | 2017-02-03 17:00:00.000|
|     2     | 2017-02-04 15:00:00.000 | 2017-02-05 10:00:00.000|
|     3     | 2017-02-06 14:00:00.000 | 2017-02-07 23:00:00.000|
----------------------------------------------------------------

As you can see, some of the ranges can cover more than 1 day period (like #2, #3), and I need to separate such records by days to have a result like:

----------------------------------------------------------------
|     id    | date_start               | date_end              |
----------------------------------------------------------------
|     1     | 2017-02-03 08:00:00.000 | 2017-02-03 17:00:00.000|
|     2     | 2017-02-04 15:00:00.000 | 2017-02-04 23:59:59.999|
|     2     | 2017-02-05 00:00:00.000 | 2017-02-05 10:00:00.000|
|     3     | 2017-02-06 14:00:00.000 | 2017-02-06 23:59:59.999|
|     3     | 2017-02-07 00:00:00.000 | 2017-02-07 23:00:00.000|
----------------------------------------------------------------

How can I do it with SQL on Redshift?

Juan Carlos Oropeza · Answer 1 · 2017-12-17T13:38:09.020

I create a list of number to create 1 day ranges. In this case I create 100 days
I keep lot of columns only for debug
Create 1 day ranges new_start and new_end
The base case is one the first interval and if both dates are on the same day then you dont need change anything
Now on the first interval I select the original date_start same as the last interval I use date_end
For the rest I use new_start and new_end - 1 second
And only do that for one day ranges overlaping your original range

SQL DEMO

WITH days as  (
    SELECT a.n
    from generate_series(1, 100) as a(n)
), ranges as (
    SELECT *, (d.n::text || ' DAY')::interval as i,
           t1.date_start::date + ((d.n - 1)::text || ' DAY')::interval as new_start,
           t1.date_start::date + (d.n::text || ' DAY')::interval as new_end,

           CASE WHEN t1.date_start::date = t1.date_end::date AND d.n = 1
                THEN t1.date_start    
                WHEN t1.date_start::date < t1.date_end::date 
                THEN t1.date_start
                ELSE NULL
           END as date_start1,
           CASE WHEN t1.date_start::date = t1.date_end::date AND d.n = 1
                THEN t1.date_end
           END  date_end1
    FROM Table1 t1
    CROSS JOIN days d
)
SELECT *, CASE WHEN date_start < new_end AND date_end > new_start
               THEN 'overlap'
          END as overlap,
          CASE WHEN date_end1 IS NOT NULL 
               THEN date_start1
               WHEN date_start < new_end AND date_end > new_start
               THEN CASE WHEN date_start > new_start
                         THEN date_start
                         ELSE new_start
                    END
          END as final_start,

          CASE WHEN date_end1 IS NOT NULL 
               THEN date_end1
               WHEN date_start < new_end AND date_end > new_start
               THEN CASE WHEN date_end < new_end
                         THEN date_end
                         ELSE new_end - '1 second'::interval
                    END
          END as final_end                      
FROM ranges
WHERE date_start < new_end AND date_end > new_start
ORDER BY "id", new_start

OUTPUT

Juan, thanks for your answer! Unfortunately, Redshift doesn't support some Postgres functions, incl generate_series ( http://docs.aws.amazon.com/redshift/latest/dg/c_unsupported-postgresql-functions.html ) . :( — Denis, Dec 17 '17 at 20:46
You can still do it, but you need to pre-generate the "generate series" table. The `generate_series()` function will work, as long as it does not interact wit the data. Therefore, you can use it to generate static tables with which to join. — John Rotenstein, Dec 17 '17 at 22:20
please amend answer or mark it as not working with redshift? — Jon Scott, Dec 18 '17 at 08:32
@JonScott You mean me? If the problem is only `generate_series()` function you can also create the list of numbers with a query similar to [**this**](https://stackoverflow.com/questions/21425546/how-to-generate-a-range-of-numbers-between-two-numbers) — Juan Carlos Oropeza, Dec 18 '17 at 13:15

score 0 · Answer 2 · answered Jan 04 '18 at 19:40

Finally, I've done in this way. Works for 2-days continuous time ranges maximum (i.e. session started 2017-12-02, ended 2017-12-04 - will not be taken in this data set; 2017-12-02 -- 2017-12-03 is ok).

 -- Select 1-st day's interval for two-days sessions:
     SELECT sessions.date_start
            ,DATE_TRUNC('day',sessions.date_end) as date_end
       FROM sessions
      WHERE DATEDIFF(day,sessions.date_start,sessions.date_end) = 1


     UNION ALL
     -- Select 2-nd day's interval for two-days sessions:
     SELECT DATE_TRUNC('day',sessions.date_end) as date_start
            ,sessions.date_end as date_end
       FROM sessions
      WHERE DATEDIFF(day, sessions.date_start, sessions.date_end) = 1

    UNION ALL
    -- Select one-day sessions:
     SELECT sessions.date_start as date_start
            ,sessions.date_end as date_end
       FROM sessions
      WHERE DATEDIFF(day, sessions.date_start, sessions.date_end) = 0

SQL split range of dates in Redshift

2 Answers2