-1

I have a data table and a calendar table. I need to fill in the gaps to have a row for each day. I need the dates to be filled in as well as to use the previous valid value for sequence and action. NULL is a valid action type. I've gotten close but the NULLS have been screwing with my results.

I'm using SQL Server 2014. Your help is much appreciated.

Data:

ID | Sequence|                Date| Action|
---|---------|--------------------|-------|
A  |       1 |  4/5/2016 12:09:23 |  yield|
A  |       2 |  4/7/2016 12:25:16 |   stop|
A  |       3 |  4/12/2016 11:25:42|   NULL|
A  |       4 |  4/18/2016 11:25:42|     go|
B  |       1 |  2/17/2016 14:15:10|  yield|
B  |       2 |  3/1/2016 7:56:37  |   stop|
B  |       3 |  4/1/2016 9:24:46  |     go|
B  |       4 |  5/4/2016 12:25:16 |   exit|

Calendar:

DateKey |                Date|
--------|--------------------|
2838    |   4/6/2016 0:00:00 |
2839    |   4/7/2016 0:00:00 |
2840    |   4/8/2016 0:00:00 |
2841    |   4/9/2016 0:00:00 |
2842    |  4/10/2016 0:00:00 |
2843    |  4/11/2016 0:00:00 |
2844    |  4/12/2016 0:00:00 |
2845    |  4/13/2016 0:00:00 |
2846    |  4/14/2016 0:00:00 |
2847    |  4/15/2016 0:00:00 |
2848    |  4/16/2016 0:00:00 |
2849    |  4/17/2016 0:00:00 |
2850    |  4/18/2016 0:00:00 |

Desired results:

ID  | Sequence|                Date|    Action|
----|---------|--------------------|----------|
A   |        1|   4/5/2016 12:09:23|     yield|
A   |        1|    4/6/2016 0:00:00|     yield|
A   |        2|   4/7/2016 12:25:16|      stop|
A   |        2|    4/8/2016 0:00:00|      stop|
A   |        2|    4/9/2016 0:00:00|      stop|
A   |        2|   4/10/2016 0:00:00|      stop|
A   |        2|   4/11/2016 0:00:00|      stop|
A   |        3|  4/12/2016 10:35:34|      NULL|
A   |        3|   4/13/2016 0:00:00|      NULL|
A   |        3|   4/14/2016 0:00:00|      NULL|
A   |        3|   4/15/2016 0:00:00|      NULL|
A   |        3|   4/16/2016 0:00:00|      NULL|
A   |        3|   4/17/2016 0:00:00|      NULL|
A   |        4|   4/18/2016 11:25:4|        go|

TSQL to get test tables up:

IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME =     'Data') 
    DROP TABLE Data;

CREATE TABLE Data (ID varchar(20), SEQ INT, Date datetime, Action     varchar(20));

INSERT INTO data (ID, SEQ, Date, Action) VALUES
('A', 1, '4/5/2016 12:09:23','yield' ),
('A', 2, '4/7/2016 12:25:16','stop' ),
('A', 3, '4/12/2016 11:25:42','NULL' ),
('A', 4, '4/18/2016 11:25:42','go' )



IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME =     'Calendar')
    DROP TABLE Calendar;

CREATE TABLE Calendar (DateKey int, Date datetime);

INSERT INTO Calendar (DateKey, Date) VALUES
('2838','4/6/2016 0:00:00'),
('2839','4/7/2016 0:00:00' ),
('2840', '4/8/2016 0:00:00' ),
('2841', '4/9/2016 0:00:00' ),
('2842', '4/10/2016 0:00:00'),
('2843', '4/11/2016 0:00:00' ),
('2844', '4/12/2016 0:00:00' ),
('2845', '4/13/2016 0:00:00' ),
('2846', '4/14/2016 0:00:00' ),
('2847', '4/15/2016 0:00:00'),
('2848', '4/16/2016 0:00:00' ),
('2849', '4/17/2016 0:00:00' ),
('2850', '4/18/2016 0:00:00' )
dk13
  • 1
  • 3
  • I think it would be much easier to do this with code instead of SQL – CptMisery Dec 20 '16 at 18:24
  • Sorry got click happy with the tags...and I would code it if I was able... – dk13 Dec 20 '16 at 18:26
  • While I don't have the time to write the query for you, I'm going to point you in the correct direction: you can do something like select * from table1 t1 left join table2 t2 ON t1.dDate <= t2.dDate, which will join all rows from table 1 with everything that matches in table 2 where the date of table 2 is greater or equal to the date in tb1. In your case it's more complicated than this, but such a join condition is where you want to start from. – Mathieu Turcotte Dec 20 '16 at 19:10
  • I recommend reading through this. While it doesn't solve your problem entirely, it could help you build the main subquery you'll need. http://stackoverflow.com/questions/19432913/select-info-from-table-where-row-has-max-date – Mathieu Turcotte Dec 21 '16 at 13:06

1 Answers1

0

What about this SQL command?

SELECT A.[ID]
      ,A.[SEQ]
      ,ISNULL(A.[Date], B.[Date]) AS 'DATE'
      ,A.[Action]
FROM DATA A FULL JOIN CALENDAR B 
     ON cast(A.[Date] as DATE) = cast(B.[Date] as DATE)
ORDER BY B.[date] ASC

The result for the FULL JOIN option:

ID  | SEQ | DATE                    | Action|
----|-----|-------------------------|-------|
A   | 1   | 2016-04-05 12:09:23.000 |yield  |
NULL| NULL| 2016-04-06 00:00:00.000 |NULL   |
A   | 2   | 2016-04-07 12:25:16.000 |stop   |
NULL| NULL| 2016-04-08 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-09 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-10 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-11 00:00:00.000 |NULL   |
A   | 3   | 2016-04-12 11:25:42.000 |NULL   |
NULL| NULL| 2016-04-13 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-14 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-15 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-16 00:00:00.000 |NULL   |
NULL| NULL| 2016-04-17 00:00:00.000 |NULL   |
A   | 4   | 2016-04-18 11:25:42.000 |go     |
Eduardo Pelais
  • 1,627
  • 15
  • 21
  • Unfortunately this did not create a row for each day – dk13 Dec 20 '16 at 19:07
  • I think what he wanted to do is the opposite. FROM Calendar LEFT JOIN Data instead of FROM Data LEFT JOIN Calendar (or change the LEFT JOIN for a RIGHT JOIN, but I don'T recommend it. Right JOINS are stupid. They are inversed LEFT JOINS 99.99% of the time, like it would be in this situation). However, that'll leave days from calendar not matching with days from data with NULL values. It won't get the previous values. That is the easy part. getting the previous values for empty days is the harder part. Check my comment in your first post for a glimpse of the solution. – Mathieu Turcotte Dec 20 '16 at 19:16
  • I actually got the dates to fill in but as you mentioned the hard part was getting the previous values. Most solutions I've seen use a variation of 'top 1' / order by in conjunction with 'where action is not null' (this is where it got me). – dk13 Dec 20 '16 at 19:28
  • yeah it's hard for me to test it out for you. We have MySQL at work, there is a couple major diffrences in the syntax. I used to develop under MS SQL Server, but I won't write a query I can't even test lol. But like I said, I'm thinking about a join with a <= or >= condition, but then again, you do want to get the latest status from the Data table before the date in the Calendar table. This can turn into a rather complicated query, but it's absolutely feasable, I've done a couple of those in the past. These often turn into queries with a couple levels of subqueries. – Mathieu Turcotte Dec 21 '16 at 13:04
  • Thanks for the feedback guys. I improved the script using FULL JOIN and wrote the result of my tests. – Eduardo Pelais Dec 21 '16 at 18:43