5

I'm new to SQL, and havn't been able to get this SQL query right yet. I currently have:

SELECT * FROM tableA
LEFT OUTER JOIN tableB
ON tableA.`full_name` = tableB.`full_name`
WHERE tableB.`id` IS NULL

Both tables have records of people, complete with names and addresses. I need to get ALL the records for those who are in tableA, but not tableB. The diagram below is basically what I need:

Ven Diagram

The problem is that two people may have the same name, but different addresses. So ultimately, I need to get the records of ALL the people that are in tableA, excluding the duplicates that have duplicate names AND addresses.

Each table has columns as follows:

id,full_name,first_name,last_name,title,phone,address,city,state,postal_code
Kai
  • 38,985
  • 14
  • 88
  • 103
mdance
  • 966
  • 5
  • 16
  • 36
  • Could you show the table structure of the tables so we don't have to guess the names? :) – Joachim Isaksson May 30 '13 at 19:21
  • lol, yes. Good Idea. :) I will edit the question. – mdance May 30 '13 at 19:24
  • Are you saying you need DISTINCT name and address combinations? Or do you need to pick one address for people that have multiple addresses? – Hart CO May 30 '13 at 19:39
  • I need DISTINCT name and address combinations. EX: There can be two people with the same name, but we know they are different individuals because they have different addresses. So, I need to retain those that may have the same name, but different addresses. – mdance May 30 '13 at 19:41

5 Answers5

9

The following query will give you all the ids of people in tableA that are not in tableB based on full name and adress:

SELECT tableA.id FROM tableA
LEFT OUTER JOIN tableB
-- people are the same if fullname and adress match
ON tableA.`full_name` = tableB.`full_name`
   AND tableA.adress = tableB.adress
-- filter people that re in tableA only
WHERE tableB.`id` IS NULL
-- filter duplicates
GROUP BY tableA.id

You can easily edit this selet to include whatever information you need from tableA.

Friederike
  • 1,252
  • 15
  • 29
6

Since you're joining on two fields you're options are an ANTI-JOIN (Friederike S' answer) Not exists,

SELECT DISTINCT tablea.* 
FROM   tablea 
WHERE  NOT EXISTS (SELECT * 
                   FROM   tableb 
                   WHERE  tablea.`full_name` = tableb.`full_name`
                          AND tableA.adress = tableB.adress) 

DEMO

You can also use not in see (Christian Ammer's) answer

Another more obscure solution is to use the ALL keyword. It's very similar to NOT IN

SELECT DISTINCT tablea.* 
FROM   tablea 
WHERE 

( tablea.`full_name`  , tableA.address) 

  !=  ALL (SELECT tableb.`full_name`, tableB.address 
           FROM tableb)

DEMO

Community
  • 1
  • 1
Conrad Frix
  • 51,984
  • 12
  • 96
  • 155
  • 1
    Your answer worked as well, but selected Friederike's answer because it was first. Thanks – mdance May 30 '13 at 20:30
  • 2
    @mdance One extra note. Depending on the nullabitlity of the columns involved you'll get difference performance characteristics between the two. See Quassnoi's article [LEFT JOIN / IS NULL vs. NOT IN vs. NOT EXISTS: nullable columns](http://planet.mysql.com/entry/?id=24888) – Conrad Frix May 30 '13 at 20:38
  • Another option is NOT IN (see [my answer](http://stackoverflow.com/a/16846165/237483)) which doesn't need a correlated subquery like the NOT EXISTS option. – Christian Ammer May 30 '13 at 21:34
  • @ChristianAmmer Thanks I totally forgot about weirdness with MySQL's IN so I updated my answer. You should be careful however about the implication about avoiding correlated subqueries for perf reasons. As Quassanoi's article points out the Nullablitiy of the fields involved dictate which solution you should use. – Conrad Frix May 30 '13 at 22:00
  • There is no weirdness with MySQL (at least, not in this case), you can include more then one column with subqueries (says the standard, you can check it with the [Mimer SQL-92 validator](http://developer.mimer.com/validator/parser92/index.tml#parser)) – Christian Ammer May 31 '13 at 07:05
  • Yep you're right. The when I checked DB Support for this. I looked at SQL Server and it doesn't support it so I figured it was an Added by MySQL. But oracle and postrgres also support it. – Conrad Frix May 31 '13 at 15:12
  • Not in and Not exists are not good practice. I like the answer by 'Friederike'. Nicely explained. – Sam Apr 02 '18 at 14:02
  • @Sam I don't know what you mean by "not good practice" since [you didn't give a rationale](https://softwareengineering.stackexchange.com/a/14895/162) but you may want to review the article I mentioned [in the earlier comment](https://stackoverflow.com/questions/16844206/get-records-from-table-a-that-are-not-in-table-b/16844959?noredirect=1#comment24291835_16844959) since there can be a perf difference, although it may have changed in the last 5 years. – Conrad Frix Apr 18 '18 at 15:48
3

In my opinion it's logical to write such a query with a subquery instead of a JOIN. And because it's a non-correlated subquery, there shouldn't be a performance difference to the JOIN Version.

SELECT *
FROM tableA
WHERE (full_name, address) NOT IN
(SELECT full_name, address FROM tableB);

You can view the result on SQL Fiddle.

Also have a look at this answers:

Community
  • 1
  • 1
Christian Ammer
  • 7,464
  • 6
  • 51
  • 108
0

Have you checked this excellent page? http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html

SELECT * FROM TableA LEFT OUTER JOIN TableB ON TableA.name = TableB.name WHERE TableB.id IS null
  • This is where I got what I have in the first place - but it doesn't solve my problem of having people with the same name, but different addresses. – mdance May 30 '13 at 19:27
  • lol yea. That image helps to diagram what I need - for the most part. – mdance May 30 '13 at 19:28
0

You can use the NOT EXIST clause. This condition will return all records from the tableA that have no records in the tableB for the given full_name.

   SELECT * FROM 
   tableA
   WHERE 
   NOT EXISTS (select * from tableB Where tableA.full_name = tableB.full_name);