4

I have two tables both containing various companies. I want to select everything from tableOne that appears in tableTwo (in fact, only if it appears in tableTwo). I’ll do this by comparing their companyID fields. They must also have the same value for a column someYear. So in other words, I want to return the companies that appear in tableTwo from tableOne, if and only if their someYear columns match.

tableOne

companyID    someYear
---------    --------
1            2010
2            2009
3            2011
1            2011
7            2009


tableTwo

companyID    someYear
---------    --------
1            2010
2            2008
3            2011
4            2011
5            2009

I want to return 1 and 3 (company 2’s year differs so I don’t want it). My useless attempt:

SELECT one.* 
FROM tableOne one, tableTwo two
WHERE one.[companyID] in (
  SELECT DISTINCT companyID
  FROM tableTwo
)
and one.someYear = two.someYear;

The problem here is that it returns millions of rows, when both tables contain less than 10,000 entries. I can't recreate the output exactly because of sensitive information. The table/column names here are all made up as I'm sure you can tell. Also I should probably point out that neither table appears to have a primary key. I assume that causes them to have nothing to match on. I’m using SQL Server 2008. Any help would be hugely appreciated.

Ciarán Tobin
  • 7,306
  • 1
  • 29
  • 45

2 Answers2

12

Try this

select one.*
from tableOne as one
  inner join tableTwo as two
    on one.companyID = two.companyID and
       one.someYear = two.someYear
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
  • @Lamak I'm embarrassingly bad at anything involving a join, I should really go learn them properly. I just have a simple follow up question. This query is returning more rows than there are in `tableTwo` which I didn't expect at all. Looking at some of the returned rows I can see some duplicates. Obviously I can take these out using `DISTINCT`. I was just wondering though, why do these duplicates occur? They aren't duplicated in either table. For example, one company has two rows in `tableTwo`, but three are returned by this query - two of which are identical/duplicates. – Ciarán Tobin Aug 19 '11 at 14:31
  • 2
    @MadScone - If you have duplicate rows in tabelOne and one row that match that duplicate in tableTwo you will get one row in the result set for each row in tableOne. – Mikael Eriksson Aug 19 '11 at 14:43
4

INNER JOIN is not the only way of performing a semi-join. Here's another equally valid way of doing the same:

SELECT * 
  FROM tableOne one
 WHERE EXISTS (
               SELECT *
                 FROM tableTwo two
                WHERE one.companyID = two.companyID
                      AND one.someYear = two.someYear
              );
onedaywhen
  • 55,269
  • 12
  • 100
  • 138