1

The usual SQL logical processing order is:

FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP

Where does OVER clause fall in the SQL logical processing order? I am trying to understand logically whether the OVER happens after the data is grouped (that is - after HAVING and before SELECT). I am confused whether DISTINCT, ORDER BY and TOP have any impact on data window used by the OVER clause.

Reference: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-transact-sql?view=sql-server-ver15#logical-processing-order-of-the-select-statement

variable
  • 8,262
  • 9
  • 95
  • 215
  • 1
    `OVER` is part of an expression, such as in a `SUM` or `LAG`, it doesn't have it's own order of execution, as `OVER` on it's own isn't part of a query. A `SUM` or `LAG` would be run at what ever point it is declared in (which is likely either the `HAVING` or `SELECT` portion of the query). What you're asking is akin to "Where does `CASE` fall in the order of execution" or "Where does `CONVERT` fall in the order of execution?" – Thom A May 13 '20 at 16:47
  • 1
    @Larnu I think the question is when the **windowed/analytical functions** are executed in general and the implications of it - i.e. when you could use the result,nest filter, and so on. In my opinion it is a good question - it help to understand why you cannot use the result in WHERE, why you could use aggegations inside windowed functions and how to filter using QUALIFY clause(vendor extension) – Lukasz Szozda May 13 '20 at 17:58

2 Answers2

2

A Beginner’s Guide to the True Order of SQL Operations by Lukas Eder:

The logical order of operations is the following (for “simplicity” I’m leaving out vendor specific things like CONNECT BY, MODEL, MATCH_RECOGNIZE, PIVOT, UNPIVOT and all the others):

FROM: ...
WHERE: ...
GROUP BY: ...
HAVING: … 

WINDOW: 
If you’re using the awesome window function feature, this is the step where they’re all calculated. Only now. 
And the cool thing is, because we have already calculated (logically!) all the aggregate functions, we can nest aggregate functions in window functions. 
It’s thus perfectly fine to write things like sum(count(*)) OVER () or row_number() OVER (ORDER BY count(*)). 
Window functions being logically calculated only now also explains why you can put them only in the SELECT or ORDER BY clauses.
They’re not available to the WHERE clause, which happened before.

SELECT: ....
DISTINCT: ...
UNION, INTERSECT, EXCEPT: ...
ORDER BY: ....
OFFSET: .
LIMIT, FETCH, TOP: ...

Related:

Why no windowed functions in where clauses?

Snowflake - QUALIFY

The QUALIFY clause filters the results of window functions. QUALIFY does with window functions what HAVING does with aggregate functions and GROUP BY clauses.

Lukasz Szozda
  • 162,964
  • 23
  • 234
  • 275
0

You are confusing expressions with clauses.

Although called a "clause" in SQL Server documentation, OVER is part of an analytic function. It is an expression that returns a scalar results.

Analytic functions can appear in the SELECT clause and ORDER BY clause. They are parsed as part of those clauses.

SQL, by the way, is a descriptive language not a procedural language. A query does not specify the "order of execution". That is determined by the compiler and optimizer. What you are referring to is the "order of parsing", which explains how identifiers are resolved in the query.

The confusion I think is usually traced to this reference. The documentation is quite clear that this refers to the "logical processing order", ("This order determines when the objects defined in one step are made available to the clauses in subsequent steps.") But people seem confused anyway.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 2
    It makes sense to consider logical order of execution to realise that the window will be on the result set after GROUP BY and HAVING is applied for example – Martin Smith May 13 '20 at 16:56
  • I have now updated the question to say "logical processing order" – variable May 13 '20 at 17:23
  • @variable . . . It doesn't change the answer. The `over` is a clause within an expression (it is a shame they have the same name). The "logical processing order" is based on the clause within the `select` statement. – Gordon Linoff May 13 '20 at 18:01
  • @MartinSmith . . . The `SELECT` (and `ORDER BY`) *is* "logically processed" after `GROUP BY`, so this is consistent with it being processed as part of the clause it is in. – Gordon Linoff May 13 '20 at 18:07
  • 2
    Yes I know that. But unfortunately this answer does not seem to focus on the key issues of logical processing order at all. It talks about parsing and resolution of identifiers and relegates the key issue as a footnote under "source of confusion" – Martin Smith May 13 '20 at 18:11
  • @MartinSmith . . . The question has been edited (at least) twice after I answered. I brought up logical processing order, not the OP. – Gordon Linoff May 13 '20 at 18:25
  • 3
    I brought it up in the initial comment in the thread above - before any edits as IMO it was clearly more relevant than the treatment given in this answer (which is somewhat dismissive with `What you are referring to is the "order of parsing", which explains how identifiers are resolved in the query`) – Martin Smith May 13 '20 at 18:29