3

I have a table of numbers in a Postgres database, and I'm trying to calculate the range (max-min) of the 3 preceding and 3 following rows, for each row. Given a test table:

CREATE TABLE test_table AS (
SELECT
    generate_series(1,100) id,
    random()*generate_series(1,100) val
);

This can be done with the following query:

SELECT
    id,
    MAX(val) OVER (
        ORDER BY id
        ROWS
            BETWEEN 3 PRECEDING
            AND 3 FOLLOWING
    ) - 
    MIN(val) OVER (
        ORDER BY id
        ROWS
            BETWEEN 3 PRECEDING
            AND 3 FOLLOWING
    ) val_range
FROM test_table;

But the OVER clause is duplicated. Is there a way to improve the query and remove the duplicate? Would the duplicate impact performance? How should I handle the moving window if I wanted to calculate other aggregates for each id?

I've come up with the following query to reuse the window, but I'm not sure if it's the optimal approach.

WITH
windowed AS (
SELECT
    id,
    UNNEST(ARRAY_AGG(val) OVER (
        ORDER BY id
        ROWS
            BETWEEN 3 PRECEDING
            AND 3 FOLLOWING
    )) val
FROM test_table
)
SELECT
    id,
    MAX(val)-MIN(val) val_range,
    AVG(val) val_mean,
    STDDEV(val) val_stddev
FROM windowed
GROUP BY id
ORDER BY id;
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
rovyko
  • 4,068
  • 5
  • 32
  • 44

1 Answers1

4

You can use a WINDOW clause to shorten your code:

SELECT id
     , MAX(val) OVER w - 
       MIN(val) OVER w AS val_range
FROM   test_table
WINDOW w AS (ORDER BY id ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING);

But that hardly affects performance at all. Postgres will reuse the window frame if you spell it out repeatedly as well. The manual:

When multiple window functions are used, all the window functions having syntactically equivalent PARTITION BY and ORDER BY clauses in their window definitions are guaranteed to be evaluated in a single pass over the data.

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • That makes sense! Is `WINDOW a` supposed to be `WINDOW w`? – rovyko Apr 16 '19 at 03:37
  • Is it generally true that using `SELECT...WINDOW...` is only syntactic sugar? Can it even improve performance? – LondonRob Aug 09 '21 at 12:58
  • 1
    @LondonRob: Yes. The `WINDOW` clause is solely a "convenience feature" in Postgres, so you can reuse window definitions in multiple `OVER` clauses of window functions. I added a quote from the manual. – Erwin Brandstetter Aug 09 '21 at 21:21
  • Thanks, this is actually an important extra piece of information about windows: you don't need to worry about specifying the same window many times, the query planner will sort it all out and only calculate each window once! – LondonRob Aug 11 '21 at 13:15