0

Let's say I have a temporal table called ProductDetails that using below query return some historical data.

SELECT * FROM ProductDetails
FOR system_time
BETWEEN '1900-01-01 00:00:00' AND '9999-12-31 00:00:00'
WHERE ProductID = 8


ID    ProductID(FK)    Attribute    Value    SysStartTime           SysEndTime
--    -------------    ---------    -----    -------------------    ----------
1     8                Size         S        2020-07-06 05:00:00    9999-12-31 23:59:59
2     8                Color        Blue     2020-07-06 05:00:01    2020-07-09 11:11:11
2     8                Color        Green    2020-07-09 11:11:11    9999-12-31 23:59:59

This means when product with ID = 8 was created at 2020-07-06 05:00:00, 2 attributes were added, and then later one of records was edited to change from "Blue" to "Green". Notice that SysStartTime for second row has 1 second difference when they were saved.

Now I need to write a query to have below results. Basically, it is attribute values in different snapshots of time when changes occurred. Time is down to minute.

Start Time          End Time            Attributes Values
----------------    ----------------    -----------------
2020-07-06 05:00    2020-07-09 11:11    Size = S, Color = Blue
2020-07-09 11:11    NULL                Size = S, Color = Green

How can I achieve that? Each product might have different attributes, but the query is for one product at a time.

Delphi.Boy
  • 1,199
  • 4
  • 17
  • 38

1 Answers1

1

Below is a solution that formats your data in one query. Performance is not an issue with a small data set of 4 rows (I added a row to your example), but my guess is that this will not be fast for millions of records.

The solution provided here generates different data sets in the form of common table expressions (CTE) and uses some techniques from other StackOverflow answers to remove the seconds and concatenate the row values. Plus a cross apply at the end.

The approach can be described in steps that correspond with the consecutive CTE's / joins:

  1. Create a set of attributes for each product.
  2. Create a set of period start moments for each product (leaving out the seconds).
  3. Combine the attributes for each product with each period and look for the appropriate value.
  4. Use some XML functions to format the attributes values in a single row.
  5. Use cross apply to fetch the period end.

Full solution:

-- sample data
declare @data table
(
    ID              int,
    ProductId       int,
    Attribute       nvarchar(10),
    Value           nvarchar(10),
    SysStartTime    datetime2(0),
    SysEndTime      datetime2(0)
);

insert into @data (ID, ProductId, Attribute, Value, SysStartTime, SysEndTime) values
(1, 8, 'Size', 'S', '2020-07-06 05:00:00', '9999-12-31 23:59:59'),
(2, 8, 'Color', 'Blue', '2020-07-06 05:00:01', '2020-07-09 11:11:11'),
(2, 8, 'Color', 'Green', '2020-07-09 11:11:11', '9999-12-31 23:59:59'),
(2, 8, 'Weight', 'Light', '2020-07-10 10:11:12', '9999-12-31 23:59:59'); -- additional data to have extra attribute not available from start

-- solution
with prodAttrib as -- attributes per product
(
    select d.ProductId, d.Attribute
    from @data d
    group by d.ProductId, d.Attribute
),
prodPeriod as -- periods per product
(
    select  d.ProductId,
            dateadd(minute, datediff(minute, 0, d.SysStartTime), 0) as 'SysStartTimeNS' -- start time No Seconds
    from @data d
    group by ProductId, dateadd(minute, datediff(minute, 0, d.SysStartTime), 0)
),
prodResult as -- attribute value per period per product
(
    select  pp.ProductId,
            convert(nvarchar(16), pp.SysStartTimeNS, 120) as 'FromDateTime',
            convert(nvarchar(16), coalesce(pe.SysEndTime, '9999-12-31 23:59:59'), 120) as 'ToDateTime',
            pa.Attribute,
            av.Value
    from prodPeriod pp
    join prodAttrib pa
        on  pa.ProductId = pp.ProductId
    outer apply (   select top 1 d.Value
                    from @data d
                    where d.ProductId = pp.ProductId
                      and d.Attribute = pa.Attribute
                      and dateadd(minute, datediff(minute, 0, d.SysStartTime), 0) <= pp.SysStartTimeNS
                    order by d.SysStartTime desc ) av -- attribute values per product
    outer apply (   select top 1 dateadd(second, -1, d.SysStartTime) as 'SysEndTime'
                    from @data d
                    where d.ProductId = pp.ProductId
                      and dateadd(minute, datediff(minute, 0, d.SysStartTime), 0) > pp.SysStartTimeNS
                    order by d.SysStartTime ) pe -- period end
),
prodResultFormat as -- concatenate attribute values per period
(
    select  pp.ProductId,
            convert(nvarchar(16), pp.SysStartTimeNS, 120) as 'FromDateTime',
            (
                select pr.Attribute + ' = ' + coalesce(pr.Value,'') + ', ' as [text()]
                from prodResult pr
                where pr.ProductId = pp.ProductId
                  and pr.FromDateTime = convert(nvarchar(16), pp.SysStartTimeNS, 120)
                order by pr.Attribute
                for xml path('')
            ) as 'Attributes'
    from prodPeriod pp
)
select  prf.ProductId,
        prf.FromDateTime,
        x.ToDateTime,
        left(prf.Attributes, len(prf.Attributes)-1) as 'Attributes'
from prodResultFormat prf
cross apply (   select top 1 pr.ToDateTime
                from prodResult pr
                where pr.ProductId = prf.ProductId
                  and pr.FromDateTime = prf.FromDateTime ) x
order by prf.ProductId, prf.FromDateTime;

Result for extended example data:

ProductId    FromDateTime      ToDateTime        Attributes
-----------  ----------------  ----------------  ----------------------------------------
8            2020-07-06 05:00  2020-07-09 11:11  Color = Blue, Size = S, Weight = 
8            2020-07-09 11:11  2020-07-10 10:11  Color = Green, Size = S, Weight = 
8            2020-07-10 10:11  9999-12-31 23:59  Color = Green, Size = S, Weight = Light

P.S. replace x.EndDateTime with case when x.ToDateTime = '9999-12-31 23:59' then NULL else x.ToDateTime end as 'ToDateTime' if you really need the NULL values.

Sander
  • 3,942
  • 2
  • 17
  • 22
  • 1
    Thanks for the answer. ToDateTime on second row should come up as 2020-07-10 10:10, otherwise the answer is not correct. I was hoping to see features of querying temporal tables such as FOR SYSTEM_TIME AS OF, etc. My current solution involves creating dynamic SQL (UNIONing STRING_AGGed snapshots of attributes in each AS OF result). – Delphi.Boy Jul 16 '20 at 18:10
  • @Delphi.Boy Fixed the end result with a second OUTER APPLY in the prodResult CTE. Did not take the AS OF in consideration (will require setting up temporal tables on my test environment), but perhaps you can merge your solution into mine. The first 2 CTE's provide the ProductId, Attribute and SysStartTimeNS which can serve as WHERE clause parameters for a new query with AS OF in the CTE prodResult. Little or no changes required after that. – Sander Jul 17 '20 at 06:43