2

I am new to SQL and doing the learning via datacamp. I was wondering if you can achieve the same result with 'HAVING' as with a nested 'WHERE' clause.

Related: SQL - having VS where I understand that HAVING is used with aggregate functions such as min, max, ..

How could I rewrite the following with HAVING?:

SELECT *
FROM populations
WHERE year = 2015 AND life_expectancy >(
   SELECT AVG(life_expectancy)*1.15
   FROM populations
);

Suppose I have 6 columns in the table 'populations': A (character), B (character), C (Character), D (number, i.e. life_expectancy as in the example), year and E (number).

I tried the following:

SELECT *
FROM populations
WHERE year = 2015
GROUP BY A, B, C, year
HAVING life_expectancy > AVG(life_expectancy)*1.15; 

However, this returns an empty table. I realise it's not a reproducible example, but perhaps a general explanation as to why it might not be working would suffice :)

Anonymous
  • 502
  • 4
  • 23
  • 1
    Stick to your sub-query, there's nothing wrong with it. – jarlh Apr 11 '18 at 07:43
  • @Anonymous There is another issue with that query: `SELECT * ... GROUP BY A,B,C, year`. Unless all other columns are functionally dependent on A,B,C, year and I guess they are not then the query will work only on MySQL with ONLY_FULL_GROUP_BY disabled. [More](https://stackoverflow.com/questions/33629168/group-by-clause-in-mysql-and-postgresql-why-the-error-in-postgresql/33629201#33629201) – Lukasz Szozda Dec 29 '19 at 05:33

4 Answers4

3

HAVING works on the results, subquery is a separate query. In your example the average you calculate is for the whole table. In the HAVING clause it affects the set of data that forms the result row. So in this case it basically means you want a value to be larger than itself multiplied by 1.15 which of course doesn’t work.

HAVING is usable when you aggregate data, for example:

SELECT country, sum(population) FROM cities
GROUP BY country
HAVING sum(population) > 100000

This groups all cities by country, then returns only the rows where sum of city populations is larger than the value.

Therefore subqueries and HAVING have a very different use cases and your query can’t be turned into HAVING since you’re not trying to compare values within groups.

Sami Kuhmonen
  • 30,146
  • 9
  • 61
  • 74
3

HAVING is generally used to limit the results of a GROUP BY clause, in the same way that a WHERE is used to limit the results of a SELECT clause.

In addition, while a WHERE clause cannot contain aggregates directly, you can either use a HAVING clause (with the aggregates), or you could use a sub-query in the WHERE clause which contains the aggregates.

Unless you are grouping your data, or otherwise aggregating it, then I cannot see why you would want to rewrite your WHERE clause as a HAVING clause.

In your example, you are creating a set of groups, based on A, B, C and Year. Then, you are limiting those groups to the ones where the life-expectancy of the group (whatever this is meant to be?) is greater than the average life-expectancy of that same group multiplied by 1.15. It doesn't make any sense, which is what you had already established.

Peter Abolins
  • 1,520
  • 1
  • 11
  • 18
  • 1
    Check your first sentence. HAVING is designed to permit filtering results based on aggregated information. It is not efficient to filter on unaggregated data in a having clause becuse that should be done before the grouping. – Paul Maxwell Apr 11 '18 at 07:52
  • Your whole answer seems to imply correctly that Having is specifically for use with aggregate functions, except your first sentence. – Magnus Apr 11 '18 at 08:06
  • Updated... hopefully I have removed some of the confusion I initially created. – Peter Abolins Apr 11 '18 at 08:35
1

You can rewrite a query with HAVING using a subquery. For example;

SELECT country, sum(population) sum_pop
  FROM cities
 GROUP BY country HAVING sum(population) > 100000
 ORDER BY sum(population) DESC;

is equivalent to:

SELECT country, sum_pop
  FROM (SELECT country, sum(population) sum_pop
          FROM cities
         GROUP BY country) subquery
 WHERE sum_pop > 100000
 ORDER BY sum_pop DESC;           
0

My two cents

Group By lets you make groups out of your complete result set so that you could perform some operation(aggregate function) whose(operations’) scope is limited to that group only and having lets you get rid of(or include) few groups based on the condition you place after having keyword and that condition can also be an aggregate function. Moreover, you can have a different aggregate function to execute on each group(in select clause) and different aggregate function in having clause to filter out a few groups. e.g. let’s say you have employees and their department data and you want to calculate Average salary of each department but if the department has less than 10 employees don’t consider that department. Here you would have to divide all employees into different groups based on department which is done by group by clause and then you would execute average operation whose scope is limited to each group. But before you execute an average operation on each group, you would like to exclude groups(i.e. departments) whose employee count is less than 10, for that, you would use a different aggregate operation(i.e. count) in having clause and that count aggregate operation is again limited to each group. e.g.

select 
department, Avg(salary) from employee 
group by department 
having count(*) > 10;

Back to your question, you tried making groups based on almost all columns that way you would end up having the number of groups almost equals to number rows in your table(i.e. each group has just one row) and then you tried to perform an aggregate function, AVG, in having clause and as said above, that aggregate function’s scope is limited to a group only. And each group has just one row. So it’s like you are trying to fetch a group(or groups) which now has just one row(because you included almost all columns in your group by clause) with column: x = 10 using the condition(in having clause): x > 10, so you won’t get any data in your fetch query.

Ankush G
  • 989
  • 9
  • 14