4

I was surprised to find that the following works:

    SELECT name, height * weight as inchpounds
    FROM sashelp.class 
    HAVING inchpounds > 5000;

Since I thought the HAVING clause was processed strictly before the SELECT clause. Is this in fact a false assumption on my part? Or is this peculiar to SAS's implementation of SQL?

jl6
  • 6,110
  • 7
  • 35
  • 65
  • I believe it is processed after the select, but whether or not you can use an aliased column from your select statement in the HAVING clause is still database specific – Eric Petroelje Jan 02 '13 at 14:08
  • Your assumption is correct for `WHERE`, but `HAVING` and `WHERE` are different in that regard. – Lucero Jan 02 '13 at 14:08
  • 1
    `HAVING` is generally used with `GROUP BY` clauses, in which case it must be evaluated after the `SELECT`. When there is no `GROUP BY`, it's up to the query optimizer. – sybkar Jan 02 '13 at 14:13
  • Your syntax is invalid. `HAVING` can only be used together with a `GROUP BY` (at least in standard SQL). What is that *SAS* thing? –  Jan 02 '13 at 14:19
  • @a_horse That is wrong, `HAVING` without `GROUP BY` is part of the standard. Think of `SELECT COUNT(*) FROM t HAVING COUNT(*) < 20 ;` – ypercubeᵀᴹ Jan 02 '13 at 14:23
  • @ypercube: OK, I should have put it more precisely: `having` is invalid without an aggregate. The above query will fail in Oracle, PostgreSQL and SQL Server nonetheless. –  Jan 02 '13 at 14:24
  • 2
    You may be interested in a similar question I posted over at dba.SE a while back - http://dba.stackexchange.com/questions/21965/why-are-queries-parsed-in-such-a-way-that-disallows-the-use-of-column-aliases-in – Shauna Jan 02 '13 at 14:27

4 Answers4

3

Strictly speaking, and if I remember well, the SQL standard define HAVING clause to be processed before the SELECT clause. So, an alias defined in the SELECT, cannot be used in HAVING clause.

See my answer in a related question: Using 'case expression column' in where clause for the order of proccessing a SELECT statement, which is:

FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY 

So, in most SQL incarnations, that query will yield an error. MySQL is one exception I know that allows this deviation from the standard (and where SELECT aliases are allowed to be used in the GROUP BY clause as well).

As @a_horse_with_no_name mentioned in comments, the query would fail in most SQL products for another reason, too: HAVING without GROUP BY means there is an aggregation over all the rows and thus only aggregate functions would be allowed in both the HAVING and the SELECT clauses.


After looking at the SAS SQL documentation and examples, it seems the query you posted above is valid in SAS SQL and executed as this (in standard SQL):

SELECT name, height * weight AS inchpounds
FROM sashelp.class 
WHERE height * weight > 5000;

If I understand it correctly, you may also mix aggregate functions with non-aggregated columns, like this:

SELECT name, height, 
       MAX(height) AS max_height
FROM sashelp.class ;

which would be evaluated as (in standard SQL):

SELECT name, height, 
       MAX(height) OVER () AS max_height
FROM sashelp.class ;
Community
  • 1
  • 1
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235
1

The SQL Standard specifies the order that the clauses are interpreted, not the order that they are executed. As a matter of practice, this means that having clauses are parsed after select, so they can use variables defined in the select statement. SAS SQL varies from other dialects. You may be able to use select variables in the where clause.

SQL engines are free to actually run the query however they like. Your question, though, is not about running the query but about parsing it.

The use of a having clause without a group by is generally suspect. However, I believe it should work in any dialect. The only question is whether 0 rows are returned or 1 summary row.

That your particularly query works is surprising. In most dialects of SQL, you would get an error, because the the having clause would imply an aggregation, and the values in the select are not in aggregation functions. This only other dialect where this would work is MySQL because it has a (mis)feature called Hidden Columns. SAS SQL is a bit farther from standard SQL than other dialects.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • I'm actually surprised the statement works. At least Oracle, PostgreSQL and SQL Server will refuse the syntax shown in the question. –  Jan 02 '13 at 14:21
  • It actually returns as many rows as there are in the input table. – jl6 Jan 02 '13 at 14:25
  • @a_horse_with_no_name . . . The `having` clause is accepted with all three databases after a valid aggregation, return either 0 rows or 1 row. I just tested this on SQLFiddle. – Gordon Linoff Jan 02 '13 at 14:28
  • Yes, I should have been more precise: the `having` clause is invalid without an aggregation, not without a `group by` (but still the original SQL is invalid in that regard) –  Jan 02 '13 at 14:42
1

I cannot speak in terms of the SQL "standard" but with respect to the SAS implementation, the only difference between WHERE and HAVING clauses is "when" they are applied. WHERE clauses apply to the creation of intermediate results (building a temp table containing the columns listed in a SELECT statement) and a HAVING clause controls which rows are written to the final table.

Alternatively, one can use the CALCULATED keyword on a WHERE statement:

SELECT name, height * weight as inchpounds
FROM sashelp.class 
WHERE CALCULATED inchpounds > 5000; 
BellevueBob
  • 9,498
  • 5
  • 29
  • 56
0

Is this in fact a false assumption on my part?

It can be false. The query planner will decide how to run the query according to the data and the statistics that the database has.

Thinking about the different clauses running in specific sequence is helpful in writing and reasoning about queries but this is not how most SQL databases actually work.

Oded
  • 489,969
  • 99
  • 883
  • 1,009