2

I have a basic 2-column table of data containing names (VARCHAR) and dates (DATE), the results of a daily scanning process. The names will generally repeat across multiple dates but they may differ from day to day. No name/date combination can repeat (composite primary key). I'm using SQL Server 2012.

MyName | MyDate
------ | ----------
ABC    | 2017-04-11
DEF    | 2017-04-11
GHJ    | 2017-04-11
ABC    | 2017-04-10
DEF    | 2017-04-10
GHJ    | 2017-04-10
ABC    | 2017-04-08
GHJ    | 2017-04-08

I need to create a view that will pull back this information from the table between two defined dates and, if there are no records for any given date in the range, generate records for that date based on the previous date that was available.

So using the example above, for a date range of 2017-04-08 to 2017-04-11, there are no records for 2017-04-09 - so I want to duplicate the records for 2017-04-08 (the previous date for which records are available) and include them in the returned dataset, á la :

  MyName | MyDate
-------- | ----------
  ABC    | 2017-04-11
  DEF    | 2017-04-11
  GHJ    | 2017-04-11
  ABC    | 2017-04-10
  DEF    | 2017-04-10
  GHJ    | 2017-04-10
**ABC    | 2017-04-09**
**GHJ    | 2017-04-09**
  ABC    | 2017-04-08
  GHJ    | 2017-04-08

I want to do this as a view as opposed to incorporating a "gap-filling" process as part of the scan / import to the table, so that I can easily identify, retrospectively, the dates for which a scan may have failed (but still be able to produce a complete dataset with every date accounted for, even if the gaps are only fudged)

I've trawled the site looking for a solution but the only ones I've found seem to accommodate tables where there is only one record per date, and filling that 'single' gap accordingly (usually with a zero rather than adopting a previous value)

I can generate a temp table of all dates in the defined range and use an outer join to add the true data from the table, but how can I plug the gaps, given that each date can have multiple records? And I only consider a gap to be where there are no records for a given date?

BeanFrog
  • 2,297
  • 12
  • 26
Alan O'Brien
  • 151
  • 1
  • 13
  • Correct me if I am wrong, I think it is not possible to have parameterized views. So you will need to go with a table-valued function. – Giorgos Altanis Apr 11 '17 at 12:08
  • I'll bow to your greater knowledge on whether or not parameterized views are possible; the date range could be hard-coded into the view to get around this (e.g. the last 6 months from today, or between the MIN and MAX dates available in the table, for example?) – Alan O'Brien Apr 11 '17 at 12:10

1 Answers1

0

You can use a table-valued function, which will allow you to use parameters, so there is no need to hard-code them.

The table returned includes an extra column Ins; it is 0 for the original data and 1 for the inserted.

Rextester demo

If you prefer a view you can easily extract the logic.

create function myFun(@start date, @end date) 
returns @result table (MyName varchar(10), MyDate date, Ins bit) as
begin
;
    with dates as (
        SELECT  DATEADD(DAY, nbr - 1, @start) dt
        FROM    ( SELECT    ROW_NUMBER() OVER ( ORDER BY c.object_id ) AS Nbr
                  FROM      sys.columns c
                ) nbrs
        WHERE   nbr - 1 <= DATEDIFF(DAY, @start, @end)
    ), last_avail as (
        select dates.dt, max(t.MyDate) prev_dt
        from dates join t on dates.dt >= t.MyDate
        group by dates.dt
    ), empty as (
        select * from last_avail where dt <> prev_dt
    )
    insert @result (MyName, MyDate, Ins)
    select MyName, MyDate, 0 from t
    union all
    select t.MyName, x.dt, 1
    from empty x join t on x.prev_dt = t.MyDate;

    return;
end

Usage example:

declare @start date, @end date;

set @start = '2017-04-08';
set @end = '2017-04-11';
select * from dbo.myFun(@start, @end) order by MyDate desc, MyName asc;

Attribution

The Common Table Expression which plays the role of a calendar table has been found here.

Community
  • 1
  • 1
Giorgos Altanis
  • 2,742
  • 1
  • 13
  • 14
  • Thanks for this, this is brilliant! However... Does this not assume that there will only ever be a single date gap between records? So if there is a gap of more than one consecutive date (e.g. records present for Apr 1st and Apr 8th and nothing in between), it will do the necessary for Apr 2nd (taking the value for Apr 1st), but Apr 3rd through Apr 7th will still be absent from the returned dataset (no directly previous day to use) ? – Alan O'Brien Apr 11 '17 at 13:00
  • I am afraid you are correct. Let me work on it for a while. – Giorgos Altanis Apr 11 '17 at 13:09
  • I think I corrected it; I introduced a new CTE `last_avail` which gives, for each date in the chosen interval, the max date for which data exist in the table. – Giorgos Altanis Apr 11 '17 at 13:24
  • Fantastic! Yes, this works a charm. Thank you so much Giorgos! – Alan O'Brien Apr 11 '17 at 14:16
  • Great, take care! – Giorgos Altanis Apr 11 '17 at 14:21