0

I need to update a table if an existing record has changes. If the record exists (APK is found), then update it if ID_NUMBER has changed. The problem is when the existing TARGET value is NOT NULL and the SOURCE value IS NULL. How can I detect that condition as unequal?

Using ISNULL() can work, but the second parameter must -not- ever occur in the data. That requires profiling all of the NUMERIC data. Can it bedone without that? In this case, zero (0) can work only if it never occurs in the data.

UPDATE T SET
    T.f1 = S.f1
FROM TARGET_TABLE T
    INNER JOIN SOURCE_TABLE S
    ON T.APK = S.APK
WHERE
    ISNULL(T.ID_NUMBER,0) <> ISNULL(S.ID_NUMBER,0)
;

Here is the possible combinations of ID_NUMBER values in TARGET and SOURCE tables.

Target      Source
======      ======
NULL        NULL
NULL        7       -- should identify as unequal
7           NULL    -- should identify as unequal, BUT DOES NOT
7           7

The following script shows the results. Only the third statement, comparing an existing TARGET value of 7 with the incoming SOURCE value of NULL will fail. Why is that? What code will work?

SET NOCOUNT ON;
SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
GO
SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
GO
SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
GO
SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
GO

Example execution:

1> SET NOCOUNT ON;
2> SELECT 3 WHERE ISNULL(NULL, NULL+1) <> ISNULL(NULL,NULL+1);
3> GO

-----------
1> SELECT 3 WHERE ISNULL(NULL, 7+1) <> ISNULL(7,7+1);
2> GO

-----------
          3
1> SELECT 3 WHERE ISNULL(7, NULL+1) <> ISNULL(NULL,NULL+1); -- WHY DOES THIS NOT SEE THE INEQUALITY?
2> GO

-----------
1> SELECT 3 WHERE ISNULL(7, 7+1) <> ISNULL(7,7+1);
2> GO

-----------
1>
lit
  • 14,456
  • 10
  • 65
  • 119

2 Answers2

2

I highly recommend doing some reading on NULL because it represents an unknown value and as such cannot be compared or added to another value. Therefore you have to treat it as a separate case using traditional AND/OR logic.

DECLARE @Table1 TABLE (APK int, ID_NUMBER int);
DECLARE @Table2 TABLE (APK int, ID_NUMBER int);

INSERT INTO @Table1 (APK, ID_NUMBER)
  VALUES (1, null), (1, null), (1, 7), (1, 7), (1, 5);

INSERT INTO @Table2 (APK, ID_NUMBER)
  VALUES (1, null), (1, 7), (1, null), (1, 7), (1, 4);

SELECT T.APK, T.ID_NUMBER, S.ID_NUMBER
FROM @Table1 T
INNER JOIN @Table2 S ON T.APK = S.APK
WHERE T.ID_Number <> S.ID_NUMBER
    OR (T.ID_Number IS NULL AND S.ID_NUMBER IS NOT NULL)
    OR (T.ID_Number IS NOT NULL AND S.ID_NUMBER IS NULL);

Given I suspect you have simplified your actual use-case you might find that EXCEPT can be used in your situation as EXCEPT (and INTERSECT) perform a different type of compare when it comes to NULLs. See here for more.

Dale K
  • 25,246
  • 15
  • 42
  • 71
  • 1
    Thank you for your answer. I used the WHERE clause as you provided. The linked article was helpful. I could not use EXCEPT because the table being updated has some additional control columns such as UPDATE_DT and others. INTERSECT looks equally strong, but since the tables do not have an equal number of columns, I would need to list them manually.. The WHERE clause is easily produced from a simple list of column names using a regex. Thanks. – lit Feb 10 '22 at 04:08
-1

Please try the following solution.

It is based on use of a checksum in CTEs via HASHBYTES() function. This method is working with NULL values and multiple columns in the tables.

I added UpdatedOn column to show what column was updated.

SQL

-- DDL and sample data population, start
DECLARE @Table1 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));
DECLARE @Table2 TABLE (APK INT IDENTITY PRIMARY KEY, ID_NUMBER INT, UpdatedOn DATETIMEOFFSET(3));

INSERT INTO @Table1 (ID_NUMBER)
  VALUES (null), (null), (7), (7), (5);

INSERT INTO @Table2 (ID_NUMBER)
  VALUES (null), (7), (null), (7), (4);
-- DDL and sample data population, end

WITH source AS
(
    SELECT sp.*, HASHBYTES('sha2_256', xmlcol) as [Checksum] 
    FROM @Table1 sp
    CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
), target AS
(
    SELECT sp.*, HASHBYTES('sha2_256', xmlcol) as [Checksum] 
    FROM @Table2 sp
    CROSS APPLY (SELECT sp.* FOR XML RAW) x(xmlcol)
)
UPDATE T 
SET T.ID_NUMBER = S.ID_NUMBER
    , T.UpdatedOn = SYSDATETIMEOFFSET()
FROM TARGET AS T
    INNER JOIN SOURCE AS S
        ON T.APK = S.APK
WHERE T.[Checksum] <> S.[Checksum];

-- test
SELECT * FROM @Table2;

Output

+-----+-----------+--------------------------------+
| APK | ID_NUMBER |           UpdatedOn            |
+-----+-----------+--------------------------------+
|   1 | NULL      | NULL                           |
|   2 | NULL      | 2022-02-09 18:58:10.336 -05:00 |
|   3 | 7         | 2022-02-09 18:58:10.336 -05:00 |
|   4 | 7         | NULL                           |
|   5 | 5         | 2022-02-09 18:58:10.336 -05:00 |
+-----+-----------+--------------------------------+
Yitzhak Khabinsky
  • 18,471
  • 2
  • 15
  • 21