-2

In MySQL version 8 series, there is SQL that uses aggregation functions such as count. LEFT JOIN is performed, and the pattern with "IS NULL" specified in the WHERE clause behaves differently than the pattern with "IS NULL" specified in the ON clause. Why?

-- pattern1
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
WHERE b.value IS NULL
;


-- pattern2
SELECT count(a.value)
FROM A AS a
LEFT JOIN B AS b
ON a.id = b.id
AND b.value IS NULL
;


-- pattern1's result = 3549
-- pattern2's result = 4000

A table has 4000 records.

GMB
  • 216,147
  • 25
  • 84
  • 135
上西潤
  • 21
  • 5
  • 2
    Please share sample data and the differences that you've seen – Nico Haase Apr 28 '23 at 07:51
  • [What is the difference between "INNER JOIN" and "OUTER JOIN"?](https://stackoverflow.com/a/46091641/3404097) – philipxy Apr 28 '23 at 12:00
  • 1
    so people are answering your literal question "Why?". but what you aren't asking and maybe should is how to get the result you want; for that you would need to describe what you want. – ysth Apr 28 '23 at 16:54

4 Answers4

3

When the condition by one table is placed in WHERE then it is applied after the joining.

When the condition by one table is placed in ON then it is applied during (but this looks like before) the joining.


CREATE TABLE t1 
SELECT 1 id, 1 val UNION ALL
SELECT 2, 2 UNION ALL
SELECT 3, 3;

CREATE TABLE t2 
SELECT 1 id, 1 val UNION ALL
SELECT 2, 2 UNION ALL
SELECT 4, 4 UNION ALL
SELECT NULL, 5;
SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id;
id val id val
1 1 1 1
2 2 2 2
3 3 null null

First SELECT joins the tables without additional condition.

SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL;
id val id val
3 3 null null

Second SELECT contains additional condition by right table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only one row is returned.

SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
            AND t2.id IS NULL;

SELECT * FROM t2 WHERE t2.id IS NULL;
id val id val
1 1 null null
2 2 null null
3 3 null null
id val
null 5

Third SELECT contains additional condition by right table in ON, which is applied to the source table. This is shown additionally.

SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.id > 1;
id val id val
2 2 2 2
3 3 null null

Forth SELECT contains additional condition by left table in WHERE. This condition is applied after joining, i.e. to the rowset which is returned by first SELECT, and only two rows is returned.

SELECT *
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
            AND t1.id > 1;

SELECT * FROM t1 WHERE t1.id > 1;
id val id val
1 1 null null
2 2 2 2
3 3 null null
id val
2 2
3 3

Fifth SELECT contains additional condition by left table in ON, which is applied to the source table. This is shown additionally.

fiddle

Akina
  • 39,301
  • 5
  • 14
  • 25
  • 1
    @上西潤 Aggregate function is applied after joining and grouping, i.e. it processes only those rowset which is obtained after joining. In other words, aggregation cannot affect the joining result. – Akina Apr 28 '23 at 12:01
1

With it in the on clause, you get at least one result row per a row, with b rows joins whenever b.value is null

With it in the where clause, you skip any a rows where one or more b rows exist and the b.values are all not null.

ysth
  • 96,171
  • 6
  • 121
  • 214
0

The first one applied the b.value IS NULL condition durin the join, whereas the second one applies the b.value IS NULL to final result set.

See dbfiddle.

slaakso
  • 8,331
  • 2
  • 16
  • 27
0

When you use WHERE b.value IS NULL, the output will only have any records having b.value = NULL. I.e. no matching records are found in B OR matched B record has value = NULL. Some records from table A might not end up in the result that are matching and have b.value not null.

While, if you use AND b.value IS NULL, that means, from the B table, ONLY THOSE records will be selected as RIGHT table output where the value is null. But from A table, ALL the records will result in the output - including those having b.value NOT NULL.

So, Pattern 2 may give MORE results in output.

Ishan
  • 400
  • 2
  • 8