3

Doing a comparison of column pairs, trying to find any that don't match

SELECT
    RecID,
    Field1,
    Field2
FROM
    TestTable
WHERE
    Field1 != Field2

However I want to include Nulls in my check. Null in both columns is valid, however Null in one column and not in the other is invalid so needs to be included in the output

SELECT
    RecID,
    Field1,
    Field2
FROM
    TestTable
WHERE
    (Field1 != Field2) 
    OR (Field1 IS NULL AND Field2 IS NOT NULL) 
    OR (Field1 IS NOT NULL AND Field2 IS NULL) 

Is this the best way to write this, or is there a tidier/better way to do Null/Not Null comparison?

Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
Midavalo
  • 469
  • 2
  • 18
  • 29
  • This looks good to me, what is the issue of doing it this way ? – JonH Jan 20 '16 at 20:35
  • @JonH There may not be any issue doing it this way. I knew this way worked, but wanted to know if there was a 'proper' way to do it. If there is an issue for me doing it this way it's just that my example above is quite simplified - my actual comparison query is quite a lot bigger (many more columns) so if there was a better/quicker/shorter way to do the comparison it may have made my query easier to write. – Midavalo Jan 20 '16 at 21:29
  • 2
    This IS the proper way. – JonH Jan 20 '16 at 21:30

2 Answers2

2

I thought that you can write the filter condition in a bit more condensed way:

SELECT
    ID,
    Field1,
    Field2
FROM
    TestTable
WHERE
    NOT((Field1 = Field2) OR (Field1 IS NULL AND Field2 IS NULL))
;

But, this query will produce incorrect result. Your variant is correct.


The Connect item about proposed IS DISTINCT FROM operator mentioned by @Heinzi and this answer have a link to a very good post by Paul White: Undocumented Query Plans: Equality Comparisons.

In that post Paul explains that query processor already has this operator, which is used for INTERSECT queries.

Your example can be rewritten as:

SELECT
    ID,
    Field1,
    Field2
FROM
    TestTable AS T
WHERE
    NOT EXISTS
    (
        SELECT T.Field1

        INTERSECT

        SELECT T.Field2
    )
;
  • It produces correct results
  • It has a good execution plan
  • In this form it is easy to add more fields for comparison
Community
  • 1
  • 1
Vladimir Baranov
  • 31,799
  • 5
  • 53
  • 90
  • I tried the same as you (inverting the condition in the Connect entry), but, as you have realized, it won't work. The reason for that is that [tertium non datur](https://en.wikipedia.org/wiki/Law_of_excluded_middle) does not apply in SQL's [ternary logic](https://en.wikipedia.org/wiki/Three-valued_logic#Logics): Both `WHERE NULL = NULL` and `WHERE NULL != NULL` yield NULL and, thus, no records. – Heinzi Jan 21 '16 at 07:29
  • Using `WHERE NOT EXISTS (SELECT Field1 INTERSECT SELECT Field2)` gives me the same result as `WHERE (Field1 != Field2) OR (Field1 IS NULL AND Field2 IS NOT NULL) OR (Field1 IS NOT NULL AND Field2 IS NULL)` but much shorter, tidier and easy to read. Thanks @Vladimir – Midavalo Jan 21 '16 at 19:28
1

In future versions of SQL Server, there will - hopefully - be a IS DISTINCT FROM operator. Go ahead and vote for it, if you like.

Until then, your only option is to work around it, as you already do. There are other ways to write it - another common variant is (a <> b OR a IS NULL OR b IS NULL) AND NOT (a IS NULL AND b IS NULL) - but I consider the solution that you already have to be more readable.

Community
  • 1
  • 1
Heinzi
  • 167,459
  • 57
  • 363
  • 519
  • 1
    The connect item and the answer that your quoted has a link to Paul White's post how `IS DISTINCT FROM` can be achieved using a trick with `INTERSECT`. – Vladimir Baranov Jan 20 '16 at 23:55