1

Consider this schema

CREATE TABLE [PrimaryTable]
    [Id] int IDENTITY(1,1) NOT NULL

ALTER TABLE [PrimaryTable]
    ADD CONSTRAINT [PK_PrimaryTable]
    PRIMARY KEY ([Id])

CREATE TABLE [SecondaryTable]
    [PrimaryTableId] int NOT NULL
    [Name] nvarchar(4000) NOT NULL
    [Value] nvarchar(4000) NULL

ALTER TABLE [SecondaryTable]
    ADD CONSTRAINT [PK_SecondaryTable] 
    PRIMARY KEY ([PrimaryTableId],[Name])

And then the following data

PrimaryTable
| Id |
|  1 |
|  2 |
|  3 |

SecondaryTable
| PrimaryTableId | Name    | Value    |
|              1 |     xxx |      yyy |
|              2 |     xxx |      zzz |

I am attempting to write a query which will give me all the entries in PrimaryTable that DO NOT have a name/value of xxx=yyy, including those where there is no entry in SecondaryTable for xxx

Currently I have the following which only returns ID = 2, and not ID = 3

SELECT Id FROM PrimaryTable
LEFT OUTER JOIN SecondaryTable ON PrimaryTable.Id = SecondaryTable.PrimaryTableId
WHERE (SecondaryTable.Name = 'xxx' AND SecondaryTable.Value NOT LIKE 'yyy')

Describing the additional clause in plain English would be something along the lines of ...OR SecondaryTable.Name = 'xxx' does not exist

Edit I should note that I've simplified both the table structure and the query for this question - other columns from PrimaryTable will also be retrieved (as well as form part of the query), and there are additional queries on SecondaryTable using different name/value combinations, and different operators (=, !=, LIKE, NOT LIKE)

(Environment is SQL Server LocalDb 2014)

mecsco
  • 2,250
  • 1
  • 15
  • 28

4 Answers4

1

I simply added a null check on the joined table so the value would be included.

SELECT Id
FROM PrimaryTable
LEFT OUTER JOIN SecondaryTable ON PrimaryTable.Id = SecondaryTable.PrimaryTableId
WHERE (
   SecondaryTable.Name = 'xxx'
   AND SecondaryTable.Value NOT LIKE 'yyy'
)
OR SecondaryTable.PrimaryTableId IS NULL
UnhandledExcepSean
  • 12,504
  • 2
  • 35
  • 51
  • Your suggestion was promising at first, but once I applied it to some of my more complex scenarios (multiple name/value pairs, including some null values) it unfortunately doesn't work due to the limitations of my join/where statements. Changing to use a NOT EXISTS (as suggested in the accepted answer) instead of the JOIN appears to meet my needs – mecsco Jun 10 '15 at 12:28
1

Using NOT IN

SELECT  Id
FROM    PrimaryTable
WHERE   Id NOT IN (
                SELECT PrimaryTableId 
                FROM SecondaryTable 
                WHERE Name='xxx' AND Value='yyy')

Using NOT EXISTS

SELECT  pt.Id
FROM    PrimaryTable pt
WHERE   NOT EXISTS (
                    SELECT  *
                    FROM    SecondaryTable st
                    WHERE   pt.Id = pt.Id
                            AND (Name='xxx' AND Value='yyy')
JamieD77
  • 13,796
  • 1
  • 17
  • 27
  • 1
    Whenever possible, convert `NOT IN` clauses to `NOT EXISTS` clauses. `NOT IN` can have issues. See [this post](http://www.sql-server-performance.com/2012/sql-server-t-sql-tuning-not-in-and-not-exists/) about performance issues and [this post](http://sqlperformance.com/2012/12/t-sql-queries/left-anti-semi-join) about nullable column interactions. – DeadZone Jun 09 '15 at 17:30
  • I agree that null values can cause issues with NOT IN. Just include PrimaryTableId IS NOT NULL in the Where clause if it's a nullable column. But I dont see anything in the first article that shows any benchmarks to show any performance increase either way. Maybe you have another article with benchmarks? – JamieD77 Jun 09 '15 at 17:44
  • I don't have any benchmarks, but have personal experience where NOT IN took significantly (orders of magnitude) longer than not exists. This is because of the differences between how the two are evaluated. If you want more detailed examples with links, try looking at [this SO question](http://stackoverflow.com/questions/173041/not-in-vs-not-exists). It seems pretty thorough. – DeadZone Jun 09 '15 at 17:57
  • Does this apply when checking against a value directly (rather than an additional SELECT)? e.g. one part of the query I omitted is a nullable column in `PrimaryTable` - currently I do the following to exclude if it matches abc/def, but include if null: `WHERE PrimaryTable.Msg NOT IN ('abc', 'def') OR PrimaryTable.Msg IS NULL` – mecsco Jun 10 '15 at 09:23
  • Replacing the JOIN with a NOT EXISTS as you suggest appears to meet my needs for both the simple scenario presented in the question, and the more complex scenarios that this will be applied to – mecsco Jun 10 '15 at 12:29
0

Try:

SELECT Id 
FROM 
    PrimaryTable 
    LEFT OUTER JOIN SecondaryTable ON 
       PrimaryTable.Id = SecondaryTable.PrimaryTableId 
       and (SecondaryTable.Name = 'xxx' 
            AND SecondaryTable.Value NOT LIKE 'yyy')
where
   SecondaryTable.PrimaryTableId  is null

Thus moving the data filter clause onto the left join and with null check in where the list works like not exists in secondary table.

the alternative using the not exists is:

select s.PrimaryTableId as Id
from
   SecondaryTable s
where 
   not exists (select id from PrimaryTable where id = s.PrimaryTableId)
Consult Yarla
  • 1,150
  • 10
  • 22
0

Try ...

SELECT P.* FROM PrimaryTable AS P
WHERE P.PrimaryTableID NOT IN
(SELECT PrimaryTableID FROM SecondaryTable)
OR
P.PrimaryTableID IN
(SELECT PrimaryTableID FROM SecondaryTable
 WHERE Name<>'xxx' AND Value<>'yyy')
JohnH
  • 1,920
  • 4
  • 25
  • 32