29

Anyone knows how to perform such query in Postgresql?

SELECT * 
FROM tabA 
WHERE NOT EXISTS (
    SELECT * 
    FROM tabB 
    WHERE tabB.id = tabA.id
)

When I execute such query, postgresql complains "ERROR: Greenplum Database does not yet support that query."

EDIT: And how about this one:

SELECT * 
FROM tabA 
WHERE NOT EXISTS (
    SELECT * 
    FROM tabB WHERE tabB.id = tabA.id AND tabB.id2 = tabA.id2
)

EDIT:
I tested in postgresql 8.2.15 for the 4 answers provided by @ypercube. Conclusions are:

1) The first does not work in this version of postgresql, as I said above in the question. The error message can be found there too.

2) For the other three answers, the execution speed is: (3)LEFT JOIN > (4)EXCEPT >> (2)NOT IN.
Specifically, for queries that have the same syntax, (3)LEFT JOIN takes about 5580ms, (4)EXCEPT takes about 13502ms, and (2)NOT IN takes more than 100000 (In fact I did not wait util it finished).
Is there any particular reasons for NOT IN clause to be so slow?
Cheng

cheng
  • 2,106
  • 6
  • 28
  • 36
  • 2
    PostgreSQL 8.2 is old and no longer supported. There were major changes in how `EXISTS` and `NOT EXISTS` queries work since then. http://www.postgresql.org/support/versioning/ – kgrittn Jun 30 '12 at 13:38
  • If you want to find the most efficient of the queries, I think you first have to check the indexes you have on the tables. You don't mention the size of the tables but 5 seconds for a query means (99%) either huge tables or lack of indexes. I suggest you add a new question, including the definitions (`CREATE TABLE`) of the two tables, the queries and the execution plans. – ypercubeᵀᴹ Jun 30 '12 at 13:58

4 Answers4

41

There are 3 (main) ways to do this kind of query:

  1. NOT EXISTS correlated subquery

  2. NOT IN subquery

  3. LEFT JOIN with IS NULL check:

You found that the first way does work in Greenplum. @Marco and @juergen provided the 2nd way. Here's the 3rd one, it may bypass Greenplum's limitations:

SELECT tabA.* 
FROM 
    tabA 
  LEFT JOIN 
    tabB 
      ON  tabB.id = tabA.id 
      AND tabB.id2 = tabA.id2
WHERE tabB.id IS NULL ;

This (4th way) also works in Postgres (which supports EXCEPT operator):

SELECT a.*
FROM a
WHERE id IN
      ( SELECT id
        FROM a
      EXCEPT
        SELECT id
        FROM b
      ) ; 

Tested in SQL-Fiddle (that all 4 work in Postgres).

ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
  • No, It doesn't. I think the reason is the filter "tabB.id IS NULL" is applied before LEFT JOIN, not after LEFT JOIN. – cheng Jun 28 '12 at 06:30
  • How to implement this query with NOT EXISTS? Can you offer any help? – cheng Jun 28 '12 at 06:31
  • As the error say, it is not "yet" supported by Greenplum. No help can be given. Since, the `NOT IN` query works, use that. – ypercubeᵀᴹ Jun 28 '12 at 06:32
  • Suppose I have two tables: UserInfo and UserOrder. I want to find users with no orders. This query "SELECT * FROM UserInfo LEFT JOIN UserOrder ON UserInfo.userid = UserOrder.userid AND UserOrder.orderid IS NULL" returns all the rows in UserInfo. In my tables there is exact one row in UserOrder and multiple rows for different users in UserInfo. – cheng Jun 28 '12 at 06:41
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/13139/discussion-between-cheng-and-ypercube) – cheng Jun 28 '12 at 06:42
  • I tested the four answer in postgresql 8.2.15. The conclusions are list in my question (The "Edit" part). Thank you for sharing this with me and others. – cheng Jun 30 '12 at 12:34
  • Perfect, I'm now using NOT EXISTS, which gives most stable results. And `NOT IN` subquery tends to be disasterous slow with low `work_mem` settings (or big tables). – denis.peplin Feb 16 '16 at 08:56
6

The part of the error you left out could have pointed you in the right direction. I think it said "DETAIL: The query contains a correlated subquery." So you have to rewrite these with joins or non-correlated subqueries.

SELECT * FROM tabA WHERE id NOT IN (SELECT id FROM tabB);

As for the second query, try

SELECT * FROM tabA WHERE (id, id2) NOT IN (SELECT id, id2 FROM tabB);
Marco Mariani
  • 13,556
  • 6
  • 39
  • 55
  • Thank you for replying so soon. I updated my question. What about the new query? – cheng Jun 28 '12 at 05:46
  • Yes, it says "The query contains a correlated subquery". This kind of query is directly supported by mysql. I thought postgresql also supports it. – cheng Jun 28 '12 at 05:50
  • postgres does, with a lot more complex stuff, but greenplum doesn't because it has to drop functionality for performance reasons. – Marco Mariani Jun 28 '12 at 05:52
  • Anyway, "SELECT * FROM tabA WHERE (id, id2) NOT IN (SELECT id, id2 FROM tabB)" Works. Thank you. – cheng Jun 28 '12 at 05:55
3
SELECT * FROM tabA 
WHERE id not in  (SELECT id FROM tabB)
juergen d
  • 201,996
  • 37
  • 293
  • 362
0

Four Ways to do ----
For this scenario create


Create Table A with name and subject

create table a (
name varchar (30),
subject varchar (40)
);

Insert Value

Insert into a values ('abdul', 'math'), ('mashood','science'), ('rehan','urdu'), ('saqib','art');

Create Table B with name and subject

create table b(
name varchar(30),
subject varchar(40)
);

Insert Value

Insert into b values ('abdul', 'math'), ('mashood','hindi'), ('rehan','farsi'), ('saqib','art'),('xyz','math');

Select from table A:

name subject
abdul math
mashood science
rehan urdu
saqib art

Select from table B:

name subject
abdul math
mashood hindi
rehan farsi
saqib art
xyz math
  1. Not IN
select name,subject from b where name not in ( select name from a where a.name=b.name and a.subject=b.subject );
name subject
mashood hindi
rehan farsi
xyz math
  1. Not Exists
select name,subject from b where not exists ( select name from a where a.name=b.name and a.subject=b.subject );
name subject
mashood hindi
rehan farsi
xyz math
  1. Join
SELECT b.* 
FROM b left JOIN a 
ON  b.name = a.name
AND b.subject = a.subject
WHERE a.name IS  NULL ;
name subject
mashood hindi
rehan farsi
xyz math
  1. Except
SELECT bm.* FROM b bm
WHERE subject IN 
( SELECT subject FROM b where name=bm.name
EXCEPT
SELECT subject FROM a where name=bm.name
) ;
name subject
mashood hindi
rehan farsi
xyz math