0

please take a look at below 2 queries regarding left outer join and tell me why there are differences. Query 1 returns 1489 rows:

SELECT distinct a.GMS_MATERIALNUMBER,a.MATERIAL_DESCRIPTION, b.LDMC
FROM [AP_GDC2_PREPARATION_TEST].[dbo].[GDM_AUTOPULL] a
left outer join  [AP_GDC2_STAGING_TEST].[dbo].[CFS_DIS_LDMC] b on
a.GMS_MATERIALNUMBER = b. GMS_MATERIALNUMBER and b.SAP_COMPANY_CODE= '1715'
and a.CFS_ORGANIZATION_CODE like 'rd_kr'

Query 2 returns only 295 rows which gives the same number of rows as when i do a simple select * from a where CFS_ORGANIZATION_CODE like 'rd_kr'

SELECT distinct a.GMS_MATERIALNUMBER,a.MATERIAL_DESCRIPTION, b.LDMC
FROM [AP_GDC2_PREPARATION_TEST].[dbo].[GDM_AUTOPULL] a
left outer join  [AP_GDC2_STAGING_TEST].[dbo].[CFS_DIS_LDMC] b on
a.GMS_MATERIALNUMBER = b. GMS_MATERIALNUMBER and b.SAP_COMPANY_CODE= '1715'
where a.CFS_ORGANIZATION_CODE like 'rd_kr'

Basically query 2 is the result i wanted, but my question is why query 1 does not work? how exactly does the SQL server work in the background when it comes to the ON clause in the left outer join ?

Cheers

praveen
  • 12,083
  • 1
  • 41
  • 49

2 Answers2

2

Both are literally different.

The first query does the filtering of table before the joining of tables will take place.

The second one filters from the total result after the joining the tables is done.


Here's an example

Table1

ID   Name
1    Stack
2    Over 
3    Flow

Table2

T1_ID   Score
1       10
2       20
3       30

In your first query, it looks like this,

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID AND
              b.Score >= 20

What it does is before joining the tables, the records of table2 are filtered first by the score. So the only records that will be joined on table1 are

T1_ID   Score
2       20
3       30

because the Score of T1_ID is only 10. The result of the query is

ID   Name    Score
1    Stack   NULL
2    Over    20
3    Flow    30

While the second query is different.

SELECT  a.*, b.Score
FROM    Table1 a
        LEFT JOIN Table2 b
           ON a.ID = b.T1_ID
WHERE   b.Score >= 20

It joins the records first whether it has a matching record on the other table or not. So the result will be

ID   Name    Score
1    Stack   10
2    Over    20
3    Flow    30

and the filtering takes place b.Score >= 20. So the final result will be

ID   Name    Score
2    Over    20
3    Flow    30
John Woo
  • 258,903
  • 69
  • 498
  • 492
0

The difference is because you made an LEFT JOIN. So you get all rows from your first table and all that match from your second table.

In the second query you JOIN first, and after you set your WHERE statement to reduce the result.

YvesR
  • 5,922
  • 6
  • 43
  • 70