0

The Problem

I'm trying to detect and react to changes in a table where each update is being recorded as a new row with some values being the same as the original, some changed (the ones I want to detect) and some NULL values (not considered changed).

For example, given the following table MyData, and assuming the OrderNumber is the common value,

ID  OrderNumber   CustomerName   PartNumber  Qty   Price   OrderDate
1   123           Acme Corp.     WG301       4     15.02   2020-01-02
2   456           Base Inc.      AL337       7     20.15   2020-02-03
3   123           NULL           WG301b      5     19.57   2020-01-02

If I execute the query for OrderNumber = 123 I would like the following data returned:

Column      OldValue  NewValue
ID          1         3
PartNumber  WG301     WG301b
Qty         4         5
Price       15.02     19.57

Or possibly a single row result with only the changes filled, like this (however, I would strongly prefer the former format):

ID  OrderNumber   CustomerName   PartNumber  Qty   Price   OrderDate
3   NULL          NULL           WG301b      5     19.57   NULL

My Solution

I have not had a chance to test this, but I was considering writing the query with the following approach (pseudo-code):

select
  NewOrNull(last.ID, prev.ID) as ID,
  NewOrNull(last.OrderNumber, prev.OrderNumber) as OrderNumber
  NewOrNull(last.CustomerName, prev.CustomerName) as CustomerName,
  ...
  from last row with OrderNumber = 123
  join previous row where OrderNumber = 123

Where the function NewOrNull(lastVal, prevVal) returns NULL if the values are equal or lastVal value is NULL, otherwise the lastVal.

Why I'm Looking for an Answer

I'm afraid that the ugly join, the number of calls to the function, and the procedural approach may make this approach not scalable. Before I start down the rabbit hole, I was wondering...

The Question

...are there any other approaches I should try, or any best practices to solving this specific type of problem?

Jens Ehrich
  • 573
  • 1
  • 6
  • 19
  • 1
    This is fairly opinion based, but IMO 1) Nothing wrong with the join, 2) Thats how you have to compare 2 values for a change, and 3) Can't see anything procedural in your example- well if you restrict it to 1 OrderNumber then yes, but if you perform the operation on a range of OrderNumbers, such as all the records affected by an update statement, then its not. – Dale K Jul 02 '20 at 22:20
  • Can you tell us what the information will be used for? – Alex Jul 02 '20 at 23:40
  • @DaleK You're right, the join isn't nearly as ugly as I had imagined after considering how to do this in a set-based manner. – Jens Ehrich Jul 03 '20 at 02:20
  • @Alex not sure why it matters, but the info will be used to highlight changes to the application's users. – Jens Ehrich Jul 03 '20 at 02:20
  • You may try [unpivot and pivot](https://stackoverflow.com/a/56031632/4271117). The key is to transpose data so that you can compare rows like the way comparing columns. – Weihui Guo Jul 03 '20 at 19:31
  • @WeihuiGuo I am starting to look into pivot and esp. unpivot, but they're a little over my head at the moment. Would you be able to provide an example of how to use unpivot on the results from this answer: https://stackoverflow.com/a/62707633/506174? – Jens Ehrich Jul 04 '20 at 19:57
  • To use unpivot, all columns should be the same data type. Also, you'll need to use dynamic SQL to get the IDs. This is just [a start point](http://sqlfiddle.com/#!18/ac8b3c/3) if you are still interested in this method. – Weihui Guo Jul 04 '20 at 21:37

1 Answers1

1

I came up with a solution for the second (less preferred) format:

The Data

Using the following data:

INSERT INTO MyData
    ([ID], [OrderNumber], [CustomerName], [PartNumber], [Qty], [Price], [OrderDate])
VALUES
    (1, 123, 'Acme Corp.', 'WG301', '4', '15.02', '2020-01-02'),
    (2, 456, 'Base Inc.', 'AL337', '7', '20.15', '2020-02-03'),
    (3, 123, NULL, 'WG301b', '5', '19.57', '2020-01-02'),
    (4, 123, 'ACME Corp.', 'WG301b', NULL, NULL, '2020-01-02'),
    (6, 456, 'Base Inc.', NULL, '7', '20.15', '2020-02-05');

The Function

This function returns the updated value if it has changed, otherwise NULL:

CREATE FUNCTION dbo.NewOrNull
(
  @newValue sql_variant, 
  @oldValue sql_variant
)  
RETURNS sql_variant
AS   
BEGIN  
    DECLARE @ret sql_variant
    SELECT @ret = CASE 
      WHEN @newValue IS NULL THEN NULL
      WHEN @oldValue IS NULL THEN @newValue
      WHEN @newValue = @oldValue THEN NULL
      ELSE @newValue
      END
    RETURN @ret
END;

The Query

This query returns the history of changes for the given order number:

select dbo.NewOrNull(new.ID, old.ID) as ID,
       dbo.NewOrNull(new.OrderNumber, old.OrderNumber) as OrderNumber,
       dbo.NewOrNull(new.CustomerName, old.CustomerName) as CustomerName,
       dbo.NewOrNull(new.PartNumber, old.PartNumber) as PartNumber,
       dbo.NewOrNull(new.Qty, old.Qty) as Qty,
       dbo.NewOrNull(new.Price, old.Price) as Price,
       dbo.NewOrNull(new.OrderDate, old.OrderDate) as OrderDate
from MyData new
left join MyData old
  on old.ID = (
    select top 1 ID 
    from MyData pre
    where pre.OrderNumber = new.OrderNumber
      and pre.ID < new.ID
    order by pre.ID desc
    )
where new.OrderNumber = 123

The Result

ID  OrderNumber CustomerName    PartNumber  Qty Price   OrderDate
1   123 Acme Corp.  WG301   4   15.02   2020-01-02
3   (null)  (null)  WG301b  5   19.57   (null)
4   (null)  ACME Corp.  (null)  (null)  (null)  (null)

The Fiddle

Here's the SQL Fiddle that shows the whole thing in action.

http://sqlfiddle.com/#!18/b720f/5/0

Jens Ehrich
  • 573
  • 1
  • 6
  • 19