2

I have read that placing conditions in WHERE or in JOIN does not matter. However, I am running into a result that sounds a bit fishy: the results differ depending on where I put the condition. Could anyone can explain its underlying logic to me?

I have two dummy tables topic and issues. An issue has a title and a relationship 1-n with topic, so that every issue is related to a topic. Then, the issue has a title that can be repeated across topics. Its structure is quite bad but I cannot change it :/

All of this is in SQL Fiddle.

  <topic>                 <issues>

id |  name           id | topic_id | title
------------         ---------------------
1  |    art           1 |     1    | final
2  |  music           2 |     1    | semi final
3  |  sport           3 |     2    | final
                      4 |     2    | draft

Now I want to select how many times a topic appears in a list of issues, getting a 0 for when there is none. Using LEFT JOIN makes it, with the help of How to include “zero” / “0” results in COUNT aggregate?:

SELECT
    t.name, count(i.title) 
FROM 
    topic as t
LEFT JOIN issues as i 
    ON t.id = i.topic_id 
GROUP BY t.id;

This returns the expected result:

name   | count
--------------
art    |    2
music  |    2
sport  |    0

Now I want to know how many times a given title appears across topics. For example, how many times "final" happens for every topic? And here is where the problem appears:

SELECT
    t.name, count(i.title) 
FROM 
    topic as t
LEFT JOIN issues as i 
    ON t.id = i.topic_id 
       AND i.title = "final"   -- in the LEFT JOIN!
GROUP BY t.id;

Returns:

name   | count
--------------
art    |    1
music  |    1
sport  |    0                  -- yeah, sport - 0 is here

Whereas

SELECT
    t.name, count(i.title) 
FROM 
    topic as t
LEFT JOIN issues as i 
    ON t.id = i.topic_id 
WHERE
    i.title = "final"          -- in the WHERE!
GROUP BY t.id;

Returns

name   | count
--------------
art    |    1
music  |    1                 -- where is sport - 0?
Community
  • 1
  • 1
fedorqui
  • 275,237
  • 103
  • 548
  • 598
  • 6
    left join with where condition makes it inner join and will make sure that the joining table has data while doing the aggregation. – Abhik Chakraborty Aug 19 '16 at 08:23
  • what @AbhikChakraborty said. You could try with WHERE (i.title IS NULL OR i.title = "final") – ignasi Aug 19 '16 at 08:24
  • 1
    To be precise, only WHERE condition involving columns from the right table of LEFT JOIN effectively makes it INNER. – Serg Aug 19 '16 at 08:43
  • 1
    Simplifying: In the first query you say, pull all record from topic and then match with issues, generate null in issues where there is no match. In the second query you say the same but you also say filter result set where title is final. – The Shooter Aug 19 '16 at 08:53

4 Answers4

3
WHERE
    i.title = "final" 

also excludes rows where i.title is NULL. So when LEFT JOIN has no corresponding values in the i table the WHERE excludes the rows.

WHERE
    i.title = "final" or i.title is null

leaves the rows

StanislavL
  • 56,971
  • 9
  • 68
  • 98
  • Uhms, interesting! So the `WHERE` condition kind of overpasses the `LEFT JOIN` joining all the values in the other table? – fedorqui Aug 19 '16 at 08:35
  • Yes. Tables first joined and a joined rows are created. Then condition of WHERE is applied – StanislavL Aug 19 '16 at 09:03
  • @fedorqui "So the WHERE condition kind of overpasses the LEFT JOIN joining all the values in the other table" is unintelligible. – philipxy Aug 20 '16 at 07:42
2

As said in the comments, your final query it is indeed an inner join, with the left join counts returns 0 if there is no match in the left table, but when you use where it becomes an inner join that only returns a result where there is a match in both tables.

 /*Joins on the key values provided, if there is no value that 
 corresponds on the left table counts returns 0 */

select t.name, count(i.title) 
from topic as t
left join issues as i on t.id = i.topic_id 
group by t.id;

select t.name, count(i.title) 
from topic as t
left join issues as i on t.id = i.topic_id 
and i.title = "final"
group by t.id;  

/*Joins on the key values provided, if there is no value that corresponds on the left 
 table then count returns 0 and then filter by the title
    */
select t.name, count(i.title) 
from topic as t
left join issues as i on t.id = i.topic_id 
where i.title = "final"
group by t.id;

/*Solution*/
select t.name, count(i.title) 
from topic as t
left join issues as i on t.id = i.topic_id 
where (i.title = "final" || i.title is null)
group by t.id;
fedorqui
  • 275,237
  • 103
  • 548
  • 598
  • Thanks for the kind answer with a good explanation. It is now more clear to me what happened there. Isn't it strange, though, how a beautiful LEFT JOIN got to become an INNER JOIN? :D – fedorqui Aug 19 '16 at 08:55
  • @philipxy I did not say that. Check my comment again. – fedorqui Aug 24 '16 at 06:40
  • @fedorqui I didn't quote you word for word but via your rhetorical question you state join-became-join so I don't understand why you say you didn't say that. Anyway, several answers used that notion first & I was mostly interested in warning you about them. I've moved a lot of comments into my answer now so I'll clean up these and others soon. – philipxy Aug 24 '16 at 16:36
1

Try to move where condition on LEFT JOIN since this if use left join table's column in where condition this will behave like INNER JOIN and will returns only Matching records

SELECT
  t.name, count(i.title) 
FROM 
  topic as t
LEFT JOIN issues as i 
  ON t.id = i.topic_id AND i.title = "final"    
GROUP BY t.id;
Jaydip Jadhav
  • 12,179
  • 6
  • 24
  • 40
1

It's just not true that "placing conditions in WHERE or in JOIN does not matter". If a FROM has only inner JOINs at the top level (INNER JOIN ON, JOIN ON, CROSS JOIN) then you can move conditions between its inner JOINs and WHERE. But the same is not true for OUTER JOIN. See CROSS JOIN vs INNER JOIN in SQL Server 2008 re inner joins and Conditions in LEFT JOIN (OUTER JOIN) vs INNER JOIN re outer & inner joins together.

An OUTER JOIN ON returns what INNER JOIN ON returns plus possibly more rows formed by extending input rows by NULLs. If you remove those additional rows later via a WHERE condition then you could have instead just used an INNER JOIN ON (with or) without the WHERE condition or just used a CROSS JOIN with the WHERE condition.

A CROSS JOIN does a cross join of its argument tables, an (INNER) JOIN ON does a CROSS JOIN then drops rows failing its ON condition, and an outer JOIN ON does an INNER JOIN ON then per LEFT/RIGHT/FULL adds certain rows formed by NULL-extending rows failing its ON condition. Then after all the joins WHERE drops rows failing its condition. Then SELECT drops, adds and renames columns.

Community
  • 1
  • 1
philipxy
  • 14,867
  • 6
  • 39
  • 83
  • Uhms, interesting. This makes me wonder: is the `WHERE` condition evaluated after the `LEFT JOIN` is performed? – fedorqui Aug 23 '16 at 11:45
  • If you don't know that a WHERE is applied after all the joins in a FROM then you don't understand some absolute basics of SQL. Did you read the links in this answer? I have added a paragraph re how SELECT works. – philipxy Aug 23 '16 at 20:06
  • @fedorqui You recently edited this answer. Thanks for your interest. If you think my answer would be improved by change in content please comment. (The sentences cut are accurate & (critically) constructive.) – philipxy Aug 26 '16 at 14:42
  • Yes, that part is true and I take it in a constructive way, I just removed it because I don't think it is relevant to future readers. I appreciate all the time you took to teach me on this topic, thanks! – fedorqui Aug 28 '16 at 19:59