281

I have the following two tables:

1. Lecturers (LectID, Fname, Lname, degree).
2. Lecturers_Specialization (LectID, Expertise).

I want to find the lecturer with the most Specialization. When I try this, it is not working:

SELECT
  L.LectID, 
  Fname, 
  Lname 
FROM Lecturers L, 
     Lecturers_Specialization S
WHERE L.LectID = S.LectID
AND COUNT(S.Expertise) >= ALL (SELECT
  COUNT(Expertise)
FROM Lecturers_Specialization
GROUP BY LectID);

But when I try this, it works:

SELECT
  L.LectID,
  Fname,
  Lname 
FROM Lecturers L,
     Lecturers_Specialization S
WHERE L.LectID = S.LectID
GROUP BY L.LectID,
         Fname,
         Lname 
HAVING COUNT(S.Expertise) >= ALL (SELECT
  COUNT(Expertise)
FROM Lecturers_Specialization
GROUP BY LectID); 

What is the reason? Thanks.

Kaushal28
  • 5,377
  • 5
  • 41
  • 72
Adam Sh
  • 8,137
  • 22
  • 60
  • 75
  • 3
    Can you clarify which version of SQL you are using (MySQL, MS SQL, PostgreSQL, Oracle, etc.). Also, when you say "doesn't working", do you mean the results aren't as you expect, or that there is a compile/parse error? – jklemmack Feb 12 '12 at 22:20
  • 3
    Why do you use ALL instead of MAX?. Is there any advantage? – skan Sep 16 '14 at 20:17

9 Answers9

457

WHERE clause introduces a condition on individual rows; HAVING clause introduces a condition on aggregations, i.e. results of selection where a single result, such as count, average, min, max, or sum, has been produced from multiple rows. Your query calls for a second kind of condition (i.e. a condition on an aggregation) hence HAVING works correctly.

As a rule of thumb, use WHERE before GROUP BY and HAVING after GROUP BY. It is a rather primitive rule, but it is useful in more than 90% of the cases.

While you're at it, you may want to re-write your query using ANSI version of the join:

SELECT  L.LectID, Fname, Lname
FROM Lecturers L
JOIN Lecturers_Specialization S ON L.LectID=S.LectID
GROUP BY L.LectID, Fname, Lname
HAVING COUNT(S.Expertise)>=ALL
(SELECT COUNT(Expertise) FROM Lecturers_Specialization GROUP BY LectID)

This would eliminate WHERE that was used as a theta join condition.

Alexander
  • 105,104
  • 32
  • 201
  • 196
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • Does it reduce performance if one uses where after group by? – theprogrammer Jul 06 '21 at 02:03
  • 3
    Why does everyone explain `HAVING` through the aggregates? It's confusing and not a full answer. `HAVING` just runs later, on the results set. We don't need aggregates to need `HAVING`. Sometimes we compute column value and then need `HAVING` to limit our results set to certain values, for example. – Robo Robok Oct 06 '22 at 12:52
  • @RoboRobok This is an explanation of having-vs-where as it applies to the OP's question with a query that has an aggregation. It's not an explanation of having-vs-where in general, even though the title of the question may suggest otherwise :-) – Sergey Kalinichenko Oct 06 '22 at 13:12
  • 2
    @SergeyKalinichenko okay, fair enough. Yet even most tutorials explain `HAVING` vs. `WHERE` this way: "use `HAVING` if you have aggregates". I remember having (pun unintended) problems understanding how `HAVING` actually works because of these poor explanations. – Robo Robok Oct 06 '22 at 13:17
  • @RoboRobok I use aggregation for the initial explain of `having` to people with no prior exposure to the concept because the motivation for the feature is easy to see (you can't filter by total before you add things up). Once they get it, I generalize to the before-and-after part. But I agree with you, aggregation to explaining having is what factorial is to explaining recursion: if you stop there, you ruin the concept for your students. – Sergey Kalinichenko Oct 06 '22 at 13:35
  • @SergeyKalinichenko totally, especially that `HAVING` is beautifully simple. – Robo Robok Oct 06 '22 at 13:40
68

First we should know the order of execution of Clauses i.e FROM > WHERE > GROUP BY > HAVING > DISTINCT > SELECT > ORDER BY. Since WHERE Clause gets executed before GROUP BY Clause the records cannot be filtered by applying WHERE to a GROUP BY applied records.

"HAVING is same as the WHERE clause but is applied on grouped records".

first the WHERE clause fetches the records based on the condition then the GROUP BY clause groups them accordingly and then the HAVING clause fetches the group records based on the having condition.

Pardhu
  • 1,789
  • 14
  • 17
  • 1
    Is this order of operations always used? What if the query optimizer changes the order? – MSIS Dec 27 '19 at 22:20
  • 7
    @MSIS even if the query optimizer changes the order, the outcome should be the same *as if* this order was followed. It's a logical order. – Stephen Jan 21 '20 at 19:00
44

HAVING operates on aggregates. Since COUNT is an aggregate function, you can't use it in a WHERE clause.

Here's some reading from MSDN on aggregate functions.

Daniel Mann
  • 57,011
  • 13
  • 100
  • 120
30
  1. WHERE clause can be used with SELECT, INSERT, and UPDATE statements, whereas HAVING can be used only with SELECT statement.

  2. WHERE filters rows before aggregation (GROUP BY), whereas HAVING filter groups after aggregations are performed.

  3. Aggregate function cannot be used in WHERE clause unless it is in a subquery contained in HAVING clause, whereas aggregate functions can be used in HAVING clause.

Source

zhulien
  • 5,145
  • 3
  • 22
  • 36
12

Didn't see an example of both in one query. So this example might help.

  /**
INTERNATIONAL_ORDERS - table of orders by company by location by day
companyId, country, city, total, date
**/

SELECT country, city, sum(total) totalCityOrders 
FROM INTERNATIONAL_ORDERS with (nolock)
WHERE companyId = 884501253109
GROUP BY country, city
HAVING country = 'MX'
ORDER BY sum(total) DESC

This filters the table first by the companyId, then groups it (by country and city) and additionally filters it down to just city aggregations of Mexico. The companyId was not needed in the aggregation but we were able to use WHERE to filter out just the rows we wanted before using GROUP BY.

Nhan
  • 1,405
  • 1
  • 13
  • 16
  • it's not a good example as you could convert : ` WHERE companyId = 884501253109 GROUP BY country, city HAVING country = 'MX' ` to: ` WHERE companyId = 884501253109, country = 'MX' GROUP BY city ` – Etienne Herlaut Nov 08 '19 at 13:23
  • If just moving the [country] filtering to the WHERE has you've suggested, the query would error from SELECT [country], as [country] is no longer included in the GROUP BY aggregation, thus cannot be selected. – Nhan Nov 08 '19 at 22:40
  • Your point on optimization is taken on moving [country] to the WHERE as that would be a smaller data set to GROUP BY with later. Of course this is just an example to illustrate possible uses. We can change to HAVING sum(total) > 1000 and that would be a completely valid case to include WHERE and HAVING. – Nhan Nov 08 '19 at 22:50
9

You can not use where clause with aggregate functions because where fetch records on the basis of condition, it goes into table record by record and then fetch record on the basis of condition we have give. So that time we can not where clause. While having clause works on the resultSet which we finally get after running a query.

Example query:

select empName, sum(Bonus) 
from employees 
order by empName 
having sum(Bonus) > 5000;

This will store the resultSet in a temporary memory, then having clause will perform its work. So we can easily use aggregate functions here.

Akash5288
  • 1,919
  • 22
  • 16
  • 2
    I think we can't use HAVING clause without GROUP BY clause. Position of HAVING Clause - SELECT -> FROM ->WHERE ->GROUP BY -> HAVING -> ORDER BY – Morez Mar 06 '17 at 15:42
  • 1
    @Morez as mentioned in MySQL 8.0 reference manual(https://dev.mysql.com/doc/refman/8.0/en/select.html), HAVING can be used without GROUP BY clause. Following is quoted from the reference manual link: The HAVING clause specifies conditions on groups, typically formed by the GROUP BY clause. The query result includes only groups satisfying the HAVING conditions. (If no GROUP BY is present, all rows implicitly form a single aggregate group.) – fishstick May 22 '21 at 04:44
4

1. We can use aggregate function with HAVING clause not by WHERE clause e.g. min,max,avg.

2. WHERE clause eliminates the record tuple by tuple HAVING clause eliminates entire group from the collection of group

Mostly HAVING is used when you have groups of data and WHERE is used when you have data in rows.

2

WHERE clause is used to eliminate the tuples in a relation,and HAVING clause is used to eliminate the groups in a relation.

HAVING clause is used for aggregate functions such as MIN,MAX,COUNT,SUM .But always use GROUP BY clause before HAVING clause to minimize the error.

Doddi girish
  • 75
  • 1
  • 7
2

Both WHERE and HAVING are used to filter data. In case of a WHERE statement, data filtering happens before you pull the data for operation.

SELECT name, age 
FROM employees
WHERE age > 30;

Here the WHERE clause filters rows before the SELECT operation is performed.

SELECT department, avg(age) avg_age
FROM employees
GROUP BY department
HAVING avg_age> 35;

HAVING filters the data after the SELECT operation is performed. Here the operation of computing (aggregation) is done first and then a filter is applied to the result using a HAVING clause.

Mithun Manohar
  • 516
  • 1
  • 6
  • 18