200

I wanted to run a simple query to throw up all the rows of Table1 where a principal column value is not present in a column in another table (Table2).

I tried using:

SELECT * FROM Table1 WHERE Table1.principal NOT IN Table2.principal

This is instead throwing a syntax error. Google search led me to forums where people were saying that MySQL does not support NOT IN and something extremely complex needs to be used. Is this true? Or am I making a horrendous mistake?

Dominic Rodger
  • 97,747
  • 36
  • 197
  • 212
KJ Saxena
  • 21,452
  • 24
  • 81
  • 109
  • 1
    And what if I want similar data from three tables. I mean one table1 has 2000 entries, the other two tables 2&3 each have say 500 entries, all of them have common field 'name'. How can we get all the details from table 1 which are not present in table2&3 based on 'name'. Can we use NOT IN twice, if so how..? –  Aug 03 '11 at 05:16

6 Answers6

335

To use IN, you must have a set, use this syntax instead:

SELECT * FROM Table1 WHERE Table1.principal NOT IN (SELECT principal FROM table2)
Julien Lebosquain
  • 40,639
  • 8
  • 105
  • 117
  • 106
    Careful when `table2.principal` can be `NULL`. In that case `NOT IN` will always return `FALSE` because `NOT IN` is treated as `<> ALL`, which compares all rows from the subquery like `Table1.principal <> table2.principal`, which fails when comparing with `NULL`: `Table1.principal <> NULL` will not result in `TRUE`. To fix: `NOT IN (SELECT principal FROM table2 WHERE principal IS NOT NULL)`. – Basti Jul 28 '15 at 12:07
  • 5
    Thanks for the comment @Basti! Spent a lot of time trying to understand why the query wasn't working as expected. – gvas Mar 23 '16 at 02:17
  • 3
    Don't forget to avoid using 'SELECT *' inside the 'NOT IN' list. You must choose a particular column. Otherwise you'll get this error: https://stackoverflow.com/questions/14046838/mysql-operand-should-contain-1-columns – Lorien Brune Jun 01 '17 at 22:16
  • @Basti, Thank you, my SQL query worked with your suggestion. – amir22 Sep 08 '21 at 13:47
173

The subquery option has already been answered, but note that in many cases a LEFT JOIN can be a faster way to do this:

SELECT table1.*
FROM table1 LEFT JOIN table2 ON table2.principal=table1.principal
WHERE table2.principal IS NULL

If you want to check multiple tables to make sure it's not present in any of the tables (like in SRKR's comment), you can use this:

SELECT table1.*
FROM table1
LEFT JOIN table2 ON table2.name=table1.name
LEFT JOIN table3 ON table3.name=table1.name
WHERE table2.name IS NULL AND table3.name IS NULL
Lukáš Lalinský
  • 40,587
  • 6
  • 104
  • 126
36

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL in MySQL

MySQL, as well as all other systems except SQL Server, is able to optimize LEFT JOIN / IS NULL to return FALSE as soon the matching value is found, and it is the only system that cared to document this behavior. […] Since MySQL is not capable of using HASH and MERGE join algorithms, the only ANTI JOIN it is capable of is the NESTED LOOPS ANTI JOIN

[…]

Essentially, [NOT IN] is exactly the same plan that LEFT JOIN / IS NULL uses, despite the fact these plans are executed by the different branches of code and they look different in the results of EXPLAIN. The algorithms are in fact the same in fact and the queries complete in same time.

[…]

It’s hard to tell exact reason for [performance drop when using NOT EXISTS], since this drop is linear and does not seem to depend on data distribution, number of values in both tables etc., as long as both fields are indexed. Since there are three pieces of code in MySQL that essentialy do one job, it is possible that the code responsible for EXISTS makes some kind of an extra check which takes extra time.

[…]

MySQL can optimize all three methods to do a sort of NESTED LOOPS ANTI JOIN. […] However, these three methods generate three different plans which are executed by three different pieces of code. The code that executes EXISTS predicate is about 30% less efficient […]

That’s why the best way to search for missing values in MySQL is using a LEFT JOIN / IS NULL or NOT IN rather than NOT EXISTS.

(emphases added)

Jari Keinänen
  • 1,361
  • 1
  • 21
  • 43
engin
  • 555
  • 4
  • 13
10

Be carefull NOT IN is not an alias for <> ANY, but for <> ALL!

http://dev.mysql.com/doc/refman/5.0/en/any-in-some-subqueries.html

SELECT c FROM t1 LEFT JOIN t2 USING (c) WHERE t2.c IS NULL

cant' be replaced by

SELECT c FROM t1 WHERE c NOT IN (SELECT c FROM t2)

You must use

SELECT c FROM t1 WHERE c <> ANY (SELECT c FROM t2)
SysDragon
  • 9,692
  • 15
  • 60
  • 89
user4554358
  • 101
  • 1
  • 2
7

Unfortunately it seems to be a issue with MySql usage of "NOT IN" clause, the screen-shoot below shows the sub-query option returning wrong results:

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 1.1.8                        |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.5.21                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
7 rows in set (0.07 sec)

mysql> select count(*) from TABLE_A where TABLE_A.Pkey not in (select distinct TABLE_B.Fkey from TABLE_B );
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.07 sec)

mysql> select count(*) from TABLE_A left join TABLE_B on TABLE_A.Pkey = TABLE_B.Fkey where TABLE_B.Pkey is null;
+----------+
| count(*) |
+----------+
|      139 |
+----------+
1 row in set (0.06 sec)

mysql> select count(*) from TABLE_A where NOT EXISTS (select * FROM TABLE_B WHERE TABLE_B.Fkey = TABLE_A.Pkey );
+----------+
| count(*) |
+----------+
|      139 |
+----------+
1 row in set (0.06 sec)

mysql> 
Legna
  • 1,632
  • 15
  • 15
0

The answers above are correct but difficult to understand. In order to make it clearer, I will give an answer based on an example. Let's say you want to select all the people who are not in the outstanding table:

option 1:

 select *
 from persons p
 where not exists(select * 
                 from outstandings o
                 where p.id = o.id)

option 2:

    select *
    from persons p
    where p.id not in( select m.id
                      from outstandings o
                      where m.id = o.id)

option 3:

    select *
    from persons p
    left join utstandings o on o.id=p.id
    where o.id in null
RivkaZ
  • 41
  • 3