2

I saw answers to a related question, but couldn't really apply what they are doing to my specific case.

I have a large table (300k rows) that I need to join with another even larger (1-2M rows) table efficiently. For my purposes, I only need to know whether a matching row exists in the second table. I came up with a nested query like so:

SELECT 
  id, 
  CASE cnt WHEN 0 then 'NO_MATCH' else 'YES_MATCH' end as match_exists
FROM 
  (
   SELECT 
     A.id as id, count(*) as cnt
   FROM
     A, B
   WHERE 
     A.id = B.foreing_id
   GROUP BY A.id
  ) AS id_and_matches_count

Is there a better and/or more efficient way to do it?

Thanks!

Community
  • 1
  • 1
Tomato
  • 772
  • 8
  • 17

1 Answers1

3

You just want a left outer join:

SELECT 
   A.id as id, count(B.foreing_id) as cnt
FROM A
LEFT OUTER JOIN B ON
    A.id = B.foreing_id
GROUP BY A.id
  • actually, my attempt was incorrect. thanks for providing proper query, but is there more efficient way to do it considering the fact that I only need to know whether count(B.foreign_id) is = 0 or > 0? – Tomato Mar 27 '14 at 08:41
  • 1
    @Tomato, if there were a way to select only one row from B for each A, you could avoid the `GROUP BY`. Like, if the first row for B that exists were always labeled 1 in another column, you could add a condition for that and avoid grouping. Otherwise, this is pretty much it. –  Mar 27 '14 at 08:46
  • thanks a lot for pointers! There is no additional column in our current schema, thus I will stick with current approach for now. – Tomato Mar 27 '14 at 09:39