1

I have records in a SQL database that have a startDate and endDate that I need to expand.

| userName    | startDate  | endDate    | weekDay |  
| :---------: | :--------: | :--------: | :-----: |  
| Test User 1 | 2011-03-30 | 2011-04-05 | 1       |  
| Test User 2 | 2016-10-05 | 2016-10-07 | 5       |  
| Test User 3 | 2018-05-22 | 2018-05-26 | 4       |  

In the table above, each record has information that covers more than one date. What I need is one record per one date per user. An example of what I'm looking for:

| userName    | startDate  | weekDay    |
| :---------: | :--------: | :--------: |
| Test User 1 | 2011-03-30 | 1          |
| Test User 1 | 2011-03-31 | 1          |
| Test User 1 | 2011-04-01 | 1          |
| Test User 1 | 2011-04-02 | 1          |
| Test User 1 | 2011-04-03 | 1          |
| Test User 1 | 2011-04-04 | 1          |
| Test User 1 | 2011-04-05 | 1          |
| Test User 2 | 2016-10-05 | 5          |
| Test User 2 | 2016-10-06 | 5          |
| Test User 2 | 2016-10-07 | 5          |
| Test User 3 | 2018-05-22 | 4          |
| Test User 3 | 2018-05-23 | 4          |
| Test User 3 | 2018-05-24 | 4          |
| Test User 3 | 2018-05-25 | 4          |
| Test User 3 | 2018-05-26 | 4          |

This answer has gotten me a step closer, specifying how to generate a sequence of dates in SQL. How can I duplicate tabular records according to start and end dates in SQL?

As a note, I need this solution to work in both MSSQL and PostgreSQL.

shrek
  • 887
  • 6
  • 12
tsouchlarakis
  • 1,499
  • 3
  • 23
  • 44
  • You might want to take a look at this question for MS SQL: [Custom SQL Calendar](https://stackoverflow.com/questions/28266972/custom-sql-calendar) – Hilarion May 16 '18 at 22:51
  • And this: [SQL, Calendar and Primary key](https://stackoverflow.com/questions/34397557/sql-calendar-and-primary-key) – Hilarion May 16 '18 at 22:53
  • Thank you, those are helpful. The `cross join` in the second question you linked is interesting, and it could work here, but then again each row will need to be cross joined with a different set of dates. So I think I understand the logic of it, but where I'm stumped is that each row needs to be duplicated a different number of times, and for different dates... The only way I can think of going about it is looping through each record, defining a `startDate` and `endDate`, then joining with a date sequence table, but I also think there must be a better/more efficient way to do it.. – tsouchlarakis May 16 '18 at 23:03
  • 2
    Assuming you have a calendar table listing all relevant dates -- which you probably should anyway, -- this is a simple `PERIODS_TABLE p JOIN CALENDAR c ON c.date BETWEEN p.startDate AND p.endDate` – Nickolay May 16 '18 at 23:21
  • @tsouchlarakis: Yes, the cross join will not give you the exact count of duplicates, as you need. It will give the same for each one, and in each case, much more row duplicates, than you need. This is not an issue though, as you just filter out the duplicates you do not need, by adding a condition, which will test if the calendar table date is between your start and end date (like Nockolay has shown). Introducing loops may be more efficient in some cases, but usually is much worse, than using the calendar table. That's why calendar tables are so common. – Hilarion Jun 13 '18 at 18:18

2 Answers2

2

You can use a recursive CTE in both SQL Server and Postgres, but the syntax is slightly different. And, there is a simpler method in Postgres. So, in SQL Server, you can do:

with cte as (
      select username, startdate, weekday, enddate
      from t
      union all
      select username, dateadd(day, 1, startdate) weekday, enddate
      from cte
      where startdate < enddate
     )
select username, startdate, weekday
from cte
order by username, startdate;

You can adjust the date arithmetic and add the recursive keyword for Postgres.

The simpler method in Postgres is a lateral join:

select t.username, g.startdate, t.weekday
from t, lateral
     generate_series(start_date, end_date, interval '1 day') g(startdate);

If you need the same code to work in both, you need to generate a numbers table. Here is one (unpleasant) method:

with digits as (
      select v.n
      from (values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) v(n)
     ),
     n as (
      select d1.n * 100 + d2.n * 10 + d3.n as n
      from digits d1 cross join digits d2 cross join digits d3
     )
select t.username, t.startdate + n.n, t.weekday
from t join
     n 
     on t.startdate + n.n <= t.enddate;

Note that for this to work startdate needs to be a datetime in SQL Server, but a date in Postgres.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Thank you for your answer. However I can't get this to work in MSSQL. I had to change a couple things to get it to run, and when it did, it looks like it's just giving me the same output as `SELECT username, startdate, weekday FROM t` would give me. I had to cast `startdate` as `datetime` in both subqueries inside the `WITH` like you said to avoid an `invalid type` error. I also had to change `dateadd(day, 1, startdate) weekday` to `dateadd(day, 1, startdate) as startdate, weekday` because the UNION was giving me problems with both sides not having the same columns.Unsure where I'm going wrong – tsouchlarakis May 23 '18 at 17:33
  • As a follow-up, I just got it to work for MSSQL! Thank you! Just needed to make those changes I mentioned my above comment. – tsouchlarakis May 23 '18 at 18:04
0

try below code. I used recursive common table expression.

;with cte
AS
(
  SELECT userName,startDate,startDate AS endDate,weekDay FROM tab1
    Union all
  SELECT t1.userName,DATEADD(d,1,t1.startdate) AS startDate,
  DATEADD(d,1,t1.startdate) AS startDate,t1.weekDay
  FROM cte t1
 JOIN tab1 t2 on t1.userName=t2.userName
 WHERE t2.endDate>t1.endDate
) 

 Select userName,startDate,weekDay from cte order by userName

SQL Fiddle: http://sqlfiddle.com/#!18/fa22a/3

Sahi
  • 1,454
  • 1
  • 13
  • 32