19

I could write a query using an aggregate function in two ways:

select team, count(min) as min_count
from table
group by team
having count(min) > 500

or

select * 
from (
    select team, count(min) as min_count
    from table
    group by team
) as A
where A.min_count > 500

Are there any performance benefits to either approach or are they functionally the same thing?

ferics2
  • 5,241
  • 7
  • 30
  • 46
  • 1
    Most importantly: Are they semantically identical? – usr Oct 17 '12 at 21:50
  • did you test them or review the explain plan? – Taryn Oct 17 '12 at 21:51
  • Can you show the query execution plan? – Tim Schmelter Oct 17 '12 at 21:51
  • 1
    This is a question that needs to be answered in terms of general principles. Measuring a single case is not what is asked here. It proves nothing about other cases. – usr Oct 17 '12 at 21:52
  • @TimSchmelter I can't here for the specific query because of the information. But thanks for the suggestion. I will look into it. – ferics2 Oct 17 '12 at 21:59
  • You need to alias `count(min)` in the second query or `where count > 500` isn't valid. [Related question here](http://stackoverflow.com/questions/12876873/is-there-a-standard-for-sql-aggregate-function-calculation/12877051#12877051). You might also be interested in [HAVING A Blunderful Time or Wish You Were WHERE](http://www.dcs.warwick.ac.uk/~hugh/TTM/HAVING-A-Blunderful-Time.html) – Martin Smith Oct 17 '12 at 22:02
  • @MartinSmith: He needs also an alias for the sub query , at least in SQL-Server. So i assume that it's a different dbms. – Tim Schmelter Oct 17 '12 at 22:04
  • @TimSchmelter - In Oracle IIRC you don't though and they haven't specified DBMS. – Martin Smith Oct 17 '12 at 22:05
  • @MartinSmith, I am aware of both of those. This is not the exact query but just quickly written. I am not looking for syntax checks here but an answer to the question of how the queries perform. – ferics2 Oct 17 '12 at 22:08
  • 1
    You seem to have omitted some critical information required for performance questions. What specific DBMS are you asking about? Because it comes down to looking at the implementation in that DBMS. – Martin Smith Oct 17 '12 at 22:09

1 Answers1

14

The two versions are functionally the same. Well, the second is syntactically incorrect, but I assume you mean:

select * 
from (
    select team, count(min) as count
    from table
    group by team
) t
where count > 500

(You need the alias on the calculation and several leading databases require an alias on a subquery in a FROM clause.)

Being functionally equivalent does not mean that they are necessarily optimized the same way. There are often multiple ways to write a query that are functionally equivalent. However, the specific database engine/optimizer can choose (and often does choose) different optimization paths.

In this case, the query is so simple that it is hard to think of multiple optimization paths. For both versions, the engine basically has to aggregate teh query and then test the second column for the filter. I personally cannot see many variations on this theme. Any decent SQL engine should use indexes, if appropriate, in either both cases or neither.

So, the anwer to this specific question is that in any reasonable database, these should result in the same execution plan (i.e., in the use of indexes, the user of parallelism, and the choice of aggregation algorithm). However, being functionally equivalent does not mean that a given database engine is going to produce the same exeuction plan. So, the general answer is "no".

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786