20

In PostgreSQL 9.4 the window functions have the new option of a FILTER to select a sub-set of the window frame for processing. The documentation mentions it, but provides no sample. An online search yields some samples, including from 2ndQuadrant but all that I found were rather trivial examples with constant expressions. What I am looking for is a filter expression that includes the value of the current row.

Assume I have a table with a bunch of columns, one of which is of date type:

col1 | col2 |     dt
------------------------
  1  |  a   | 2015-07-01
  2  |  b   | 2015-07-03
  3  |  c   | 2015-07-10
  4  |  d   | 2015-07-11
  5  |  e   | 2015-07-11
  6  |  f   | 2015-07-13
...

A window definition for processing on the date over the entire table is trivially constructed: WINDOW win AS (ORDER BY dt)

I am interested in knowing how many rows are present in, say, the 4 days prior to the current row (inclusive). So I want to generate this output:

col1 | col2 |     dt     | count
--------------------------------
  1  |  a   | 2015-07-01 |   1
  2  |  b   | 2015-07-03 |   2
  3  |  c   | 2015-07-10 |   1
  4  |  d   | 2015-07-11 |   3
  5  |  e   | 2015-07-11 |   3
  6  |  f   | 2015-07-13 |   4
...

The FILTER clause of the window functions seems like the obvious choice:

count(*) FILTER (WHERE current_row.dt - dt <= 4) OVER win

But how do I specify current_row.dt (for lack of a better syntax)? Is this even possible?

If this is not possible, are there other ways of selecting date ranges in a window frame? The frame specification is no help as it is all row-based.

I am not interested in alternative solutions using sub-queries, it has to be based on window processing.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Patrick
  • 29,357
  • 6
  • 62
  • 90
  • Interesting problem. As Postgres regular you should know to provide a table definition (or better yet, a complete `CREATE TABLE` script) – Erwin Brandstetter Jul 14 '15 at 04:55
  • @ErwinBrandstetter Is this something that would merit flagging as a "feature request"? To me it seems quite valuable to be able to filter the window frame on the basis of some condition present in the current row. – Patrick Jul 16 '15 at 14:15
  • Yes, It would be useful for various problems, I have seen several related questions here on SO. I am afraid the feature would require a lot of work due to principal changes in the implementation, but it won't hurt to document public interest in the feature. Seems related to the missing `RANGE BETWEEN ... PRECEDING/FOLLOWING` feature that is already [documented in the ToDo Wiki](https://wiki.postgresql.org/wiki/Todo#Window_Functions) – Erwin Brandstetter Jul 16 '15 at 14:49

3 Answers3

6

You are not actually aggregating rows, so the new aggregate FILTER clause is not the right tool. A window function is more like it, a problem remains, however: the frame definition of a window cannot depend on values of the current row. It can only count a given number of rows preceding or following with the ROWS clause.

To make that work, aggregate counts per day and LEFT JOIN to a full set of days in range. Then you can apply a window function:

SELECT t.*, ct.ct_last4days
FROM  (
   SELECT *, sum(ct) OVER (ORDER BY dt ROWS 3 PRECEDING) AS ct_last4days
   FROM  (
      SELECT generate_series(min(dt), max(dt), interval '1 day')::date AS dt
      FROM   tbl t1
      ) d
   LEFT   JOIN (SELECT dt, count(*) AS ct FROM tbl GROUP BY 1) t USING (dt)
   ) ct
JOIN  tbl t USING (dt);

Omitting ORDER BY dt in the widow frame definition usually works, since the order is carried over from generate_series() in the subquery. But there are no guarantees in the SQL standard without explicit ORDER BY and it might break in more complex queries.

SQL Fiddle.

Related:

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • When the `FILTER` clause applies to aggregate functions in the presence of an `OVER` clause they [*act as window functions (...) otherwise they act as regular aggregates*](http://www.postgresql.org/docs/current/static/functions-window.html) so that would invalidate your opening sentence, if I understand both the documentation and you correctly. Accepted for being a working solution and meticulous attention to detail, but only reluctantly so for the dependency on sub-queries, for which you are obviously not to blame. – Patrick Jul 16 '15 at 14:24
  • @Patrick: Aggregate functions (can) fold rows, while window functions don't. You obviously don't want to fold rows, so you are in the realm of window functions. The `FILTER` clause implements some features of window functions for aggregate functions, but you need a window function anyway. – Erwin Brandstetter Jul 16 '15 at 14:43
3

I don't think there is any syntax that means "current row" in an expression. The gram.y file for postgres makes a filter clause take just an a_expr, which is just the normal expression clauses. There is nothing specific to window functions or filter clauses in an expression. As far as I can find, the only current row notion in a window clause is for specifying the window frame boundaries. I don't think this gets you what you want.

It's possible that you could get some traction from an enclosing query:

http://www.postgresql.org/docs/current/static/sql-expressions.html

When an aggregate expression appears in a subquery (see Section 4.2.11 and Section 9.22), the aggregate is normally evaluated over the rows of the subquery. But an exception occurs if the aggregate's arguments (and filter_clause if any) contain only outer-level variables: the aggregate then belongs to the nearest such outer level, and is evaluated over the rows of that query.

but it's not obvious to me how.

user464502
  • 2,203
  • 11
  • 14
0

https://www.postgresql.org/docs/release/11.0/

Window functions now support all framing options shown in the SQL:2011 standard, including RANGE distance PRECEDING/FOLLOWING, GROUPS mode, and frame exclusion options

https://dbfiddle.uk/p-TZHp7s

You can do something like

count(dt) over(order by dt RANGE BETWEEN INTERVAL '3 DAYS' PRECEDING AND CURRENT ROW)
jian
  • 4,119
  • 1
  • 17
  • 32