13

Can I modify the next to use the column aliases avg_time and cnt in an expression ROUND(avg_time * cnt, 2)?

SELECT 
    COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time, 
    MAX(time) as max_time, 
    ROUND(AVG(time), 2) as avg_time, 
    MIN(time) as min_time, 
    COUNT(path) as cnt, 
    ROUND(avg_time * cnt, 2) as slowdown, path
FROM 
    loadtime
GROUP BY
    path
ORDER BY
    avg_time DESC
LIMIT 10;

It raises the next error:

ERROR:  column "avg_time" does not exist
LINE 7:  ROUND(avg_time * cnt, 2) as slowdown, path

The next, however, works fine (use primary expressions instead of column aliases:

SELECT 
    COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time, 
    MAX(time) as max_time, 
    ROUND(AVG(time), 2) as avg_time, 
    MIN(time) as min_time, 
    COUNT(path) as cnt, 
    ROUND(AVG(time) * COUNT(path), 2) as slowdown, path
FROM 
    loadtime
GROUP BY
    path
ORDER BY
    avg_time DESC
LIMIT 10;
sergzach
  • 6,578
  • 7
  • 46
  • 84

4 Answers4

12

You can use a previously created alias in the GROUP BY or HAVING statement but not in a SELECT or WHERE statement. This is because the program processes all of the SELECT statement at the same time and doesn't know the alias' value yet.

The solution is to encapsulate the query in a subquery and then the alias is available outside.

SELECT stddev_time, max_time, avg_time, min_time, cnt, 
       ROUND(avg_time * cnt, 2) as slowdown
FROM (
        SELECT 
            COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time, 
            MAX(time) as max_time, 
            ROUND(AVG(time), 2) as avg_time, 
            MIN(time) as min_time, 
            COUNT(path) as cnt, 
            path
        FROM 
            loadtime
        GROUP BY
            path
        ORDER BY
            avg_time DESC
        LIMIT 10
   ) X;
Ratan Uday Kumar
  • 5,738
  • 6
  • 35
  • 54
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
9

The order of execution of a query (and thus the evaluation of expressions and aliases) is NOT the same as the way it is written. The "general" position is that the clauses are evaluated in this sequence:

FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY

Hence the column aliases are unknown to most of the query until the select clause is complete (and this is why you can use aliases in the ORDER BY clause). However table aliases which are established in the from clause are understood in the where to order by clauses.

The most common workaround is to encapsulate your query into a "derived table"

Suggested reading: Order Of Execution of the SQL query

Note: different SQL dbms have different specific rules regarding use of aliases

EDIT The purpose behind reminding readers of the logical clause sequence is that often (but not always) aliases only becomes referable AFTER the clause where the alias is declared. The most common of which is that aliases declared in the SELECT clause can be used by the ORDER BY clause. In particular, an alias declared in a SELECT clause cannot be referenced within the same SELECT clause.

But please do note that due to differences in products not every dbms will behave in this manner

Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51
  • where you get that sequence? I think `GROUP BY` and `HAVING` go after `SELECT` – Juan Carlos Oropeza Jan 22 '16 at 21:16
  • It is the ""logical operation" of SQL query. When you purchase nails to build something, the first thing you must do is GO to the shop THEN select the nails i.e. FROM the_shop SELECT the_nails. It's the same in SQL you go to the tables & join them (FROM), choose which rows (WHERE), GROUP iif needed, choose from those groups (HAVING) then now we have the right data (SELECT) we can ORDER it. – Paul Maxwell Jan 22 '16 at 21:27
  • after double check in SQL Server, Oracle **CANT** use alias in `GROUP BY` Postgres **CAN** and MySQL go rogue and **CAN** use alias in the `SELECT` – Juan Carlos Oropeza Jan 22 '16 at 21:28
  • as I said it does differ between dbms's e.g. I know terradata does allow use of column aliases earlier than the order by; however the generic approach is to assume you cannot. You can for example use an *apply* operator in mssql to get a column alias usable throughout a query - but that is not generic at all. – Paul Maxwell Jan 22 '16 at 21:33
  • this does not explain why the alias is not recognized __within the SELECT statement itself__ as the OP is asking – johnDanger Nov 13 '19 at 23:55
  • 1
    @johnDanger thanks for your comment. My reasoning was implied but not stated, so I have attempted to make it more explicit (see the edit). I trust this helps. – Paul Maxwell Nov 18 '19 at 06:20
3

Aliases are not available until the virtual relation is actually created, if you want to do additional expressions using the aliases themselves you will have to create the virtual relation using as sub-query than run an additional query on top of it. So I would modify your query to the following:

SELECT stddev_time, max_time, avg_time, min_time, ROUND(avg_time * cnt, 2) as slowdown, path FROM
(
SELECT 
    COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time, 
    MAX(time) as max_time, 
    ROUND(AVG(time), 2) as avg_time, 
    MIN(time) as min_time, 
    COUNT(path) as cnt, 
    ROUND(AVG(time) * COUNT(path), 2) as slowdown, path
FROM 
    loadtime
GROUP BY
    path
ORDER BY
    avg_time DESC
LIMIT 10;
)

I want to add here the reason your second query worked is because the query planner recognized those columns as defined directly in the table you're querying them from.

unxn3rd
  • 257
  • 2
  • 5
2

Either repeat the expressions:

ROUND(ROUND(AVG(time), 2) * COUNT(path), 2) as slowdown

or use an subquery:

SELECT *, ROUND(avg_time * cnt, 2) as slowdown FROM (
  SELECT 
    COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time, 
    MAX(time) as max_time, 
    ROUND(AVG(time), 2) as avg_time, 
    MIN(time) as min_time, 
    COUNT(path) as cnt, 
    path
  FROM loadtime
  GROUP BY path) x
ORDER BY avg_time DESC
LIMIT 10;
Bohemian
  • 412,405
  • 93
  • 575
  • 722