465

I'm doing some maintenance work on a database application and I've discovered that, joy of joys, even though values from one table are being used in the style of foreign keys, there's no foreign key constraints on the tables.

I'm trying to add FK constraints on these columns, but I'm finding that, because there's already a whole load of bad data in the tables from previous errors which have been naively corrected, I need to find the rows which don't match up to the other table and then delete them.

I've found some examples of this kind of query on the web, but they all seem to provide examples rather than explanations, and I don't understand why they work.

Can someone explain to me how to construct a query which returns all the rows with no matches in another table, and what it's doing, so that I can make these queries myself, rather than coming running to SO for every table in this mess that has no FK constraints?

Kara
  • 6,115
  • 16
  • 50
  • 57
Frosty840
  • 7,965
  • 12
  • 50
  • 86

11 Answers11

839

Here's a simple query:

SELECT t1.ID
FROM Table1 t1
    LEFT JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.ID IS NULL

The key points are:

  1. LEFT JOIN is used; this will return ALL rows from Table1, regardless of whether or not there is a matching row in Table2.

  2. The WHERE t2.ID IS NULL clause; this will restrict the results returned to only those rows where the ID returned from Table2 is null - in other words there is NO record in Table2 for that particular ID from Table1. Table2.ID will be returned as NULL for all records from Table1 where the ID is not matched in Table2.

Toby Speight
  • 27,591
  • 48
  • 66
  • 103
AdaTheDev
  • 142,592
  • 28
  • 206
  • 200
  • 10
    Fails if an ID is NULL – Michael Aug 13 '13 at 03:53
  • 269
    @Michael - If having a `NULL` ID is valid in your schema, you might have bigger problems, wouldn't you agree? :) – rinogo Nov 07 '13 at 04:44
  • 1
    will this work even if table1 has more records then table2? if table1 has 100 records and table2 has 200 records (100 that match/join and 100 that don't match/join) would we get all 200 records returned? – Juan Velez Aug 08 '16 at 20:15
  • 1
    I often like to wrap the left join as a subquery/inline view in order to ensure there is no interplay between the WHERE clause and the LEFT JOIN. – Andrew Wolfe Feb 07 '17 at 00:47
  • 1
    This is much slower than `exists` when there's more than one row in Table2 for each row in Table 1. https://stackoverflow.com/a/36694478/37923 is a better answer – Mark Mar 19 '18 at 18:18
  • 2
    I find it very strange that we first say: `t1.ID = t2.ID` which means both id in both tables should be equal and then we say oh but the id on the other table should be null `t2.ID IS NULL` doesn't that second statement means that t1.ID != t2.ID which is contradictory to the first statement? – Jas Aug 24 '18 at 07:48
  • 1
    @jas t1.ID = t2.ID is the join. t2.ID is Null is the filter. – Cool Breeze Jan 11 '19 at 05:16
  • 3
    @Jas Key point 1 of the answer, ALL rows from first table, even those not matching t1.ID = t2.ID condition of left join. If you change first line to `SELECT t1.ID, t2.ID` and remove WHERE line you will get better idea how this works. –  Aug 19 '19 at 12:19
190

I would use EXISTS expression since it is more powerful, you can e.g. more precisely choose rows you would like to join. In the case of LEFT JOIN, you have to take everything that's in the joined table. Its efficiency is probably the same as in the case of LEFT JOIN with null constraint.

SELECT t1.ID
FROM Table1 t1
WHERE NOT EXISTS (SELECT t2.ID FROM Table2 t2 WHERE t1.ID = t2.ID)
Ondrej Bozek
  • 10,987
  • 7
  • 54
  • 70
  • Something this simple is easily handled by the query optimizer for best execution. – Andrew Wolfe Feb 07 '17 at 00:44
  • 2
    Yes, main advantage of `EXISTS` is its variability. – Ondrej Bozek Feb 07 '17 at 12:31
  • is it supposed to work in mysql ? I keep getting #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near – Pavel Niedoba Mar 21 '18 at 13:46
  • It shoudl work for MySQL see docs: https://dev.mysql.com/doc/refman/5.5/en/exists-and-not-exists-subqueries.html – Ondrej Bozek Mar 26 '18 at 10:03
  • 6
    Actually reduced the speed of one query i had from 7 sec to 200ms... (compared to `WHERE t2.id IS NULL` ) Thank You. – Moti Korets Feb 12 '19 at 20:56
  • 15
    @MotiKorets you mean increased the speed :) – Ondrej Bozek Apr 23 '19 at 14:01
  • 1
    The other advantage of this method is if working in phpMyAdmin, it potentially gives a unique key to the line so you get the Edit, Copy, Delete options in the visual interface as opposed to using a join where you may not get those options. – Scott May 25 '20 at 10:53
23
SELECT id FROM table1 WHERE foreign_key_id_column NOT IN (SELECT id FROM table2)

Table 1 has a column that you want to add the foreign key constraint to, but the values in the foreign_key_id_column don't all match up with an id in table 2.

  1. The initial select lists the ids from table1. These will be the rows we want to delete.
  2. The NOT IN clause in the where statement limits the query to only rows where the value in the foreign_key_id_column is not in the list of table 2 ids.
  3. The SELECT statement in parenthesis will get a list of all the ids that are in table 2.
zb226
  • 9,586
  • 6
  • 49
  • 79
Theo Voss
  • 271
  • 2
  • 3
  • @zb226: Your link to has to do with limits on the `IN` clause with a list of literal values. It doesn't apply to using an `IN` clause with the result of a sub-query. That accepted answer to that question actually solves the problem by using a sub-query. (A large list of literal values is problematic because it creates a huge SQL expression. A sub-query works fine because, even if the resulting list is large, the SQL expression itself is small.) – Kannan Goundan Feb 08 '19 at 00:26
  • Not a good option to use. Imagine if you are querying on huge data. let's say, a million of rows fetched and included in the subquery and compiler again has to match each row against all the ids returned in subquery. A big no from my side. – Jamshaid K. Mar 29 '21 at 13:29
13

Let we have the following 2 tables(salary and employee) enter image description here

Now i want those records from employee table which are not in salary. We can do this in 3 ways:

  1. Using inner Join
select * from employee
where id not in(select e.id from employee e inner join salary s on e.id=s.id)

enter image description here

  1. Using Left outer join
select * from employee e 
left outer join salary s on e.id=s.id  where s.id is null

enter image description here

  1. Using Full Join
select * from employee e
full outer join salary s on e.id=s.id where e.id not in(select id from salary)

enter image description here

spicy.dll
  • 948
  • 8
  • 23
Debendra Dash
  • 5,334
  • 46
  • 38
8

Where T2 is the table to which you're adding the constraint:

SELECT *
FROM T2
WHERE constrained_field NOT
IN (
    SELECT DISTINCT t.constrained_field
    FROM T2 
    INNER JOIN T1 t
    USING ( constrained_field )
)

And delete the results.

zb226
  • 9,586
  • 6
  • 49
  • 79
Karel
  • 89
  • 1
  • 2
4

From similar question here MySQL Inner Join Query To Get Records Not Present in Other Table I got this to work

SELECT * FROM bigtable 
LEFT JOIN smalltable ON bigtable.id = smalltable.id 
WHERE smalltable.id IS NULL

smalltable is where you have missing records, bigtable is where you have all the records. The query list all the records that not exist in smalltable but exists on the bigtable. You could replace id by any other matching criteria.

Luis H Cabrejo
  • 302
  • 1
  • 8
2

I Dont Knew Which one Is Optimized (compared to @AdaTheDev ) but This one seems to be quicker when I use (atleast for me)

SELECT id  FROM  table_1 EXCEPT SELECT DISTINCT (table1_id) table1_id FROM table_2

If You want to get any other specific attribute you can use:

SELECT COUNT(*) FROM table_1 where id in (SELECT id  FROM  table_1 EXCEPT SELECT DISTINCT (table1_id) table1_id FROM table_2);
William Desportes
  • 1,412
  • 1
  • 22
  • 31
jerinisready
  • 936
  • 10
  • 24
0

You could opt for Views as shown below:

CREATE VIEW AuthorizedUserProjectView AS select t1.username as username, t1.email as useremail, p.id as projectid, 
(select m.role from userproject m where m.projectid = p.id and m.userid = t1.id) as role 
FROM authorizeduser as t1, project as p

and then work on the view for selecting or updating:

select * from AuthorizedUserProjectView where projectid = 49

which yields the result as shown in the picture below i.e. for non-matching column null has been filled in.

[Result of select on the view][1]
-2

You can do something like this

   SELECT IFNULL(`price`.`fPrice`,100) as fPrice,product.ProductId,ProductName 
          FROM `products` left join `price` ON 
          price.ProductId=product.ProductId AND (GeoFancingId=1 OR GeoFancingId 
          IS NULL) WHERE Status="Active" AND Delete="No"
Hiren Makwana
  • 480
  • 5
  • 12
-4

SELECT * FROM First_table MINUS SELECT * FROM another

-6

How to select rows with no matching entry in Both table?


    select * from [dbo].[EmppDetails] e
     right join [Employee].[Gender] d on e.Gid=d.Gid
    where e.Gid is Null

    union 
    select * from [dbo].[EmppDetails] e
     left join [Employee].[Gender] d on e.Gid=d.Gid
    where d.Gid is Null