demo:db<>fiddle
Solution with row_number
window function (https://www.postgresql.org/docs/current/static/tutorial-window.html)
SELECT
customer, order_date, order_total
FROM (
SELECT
*,
first_value(order_date) OVER w as last_order,
first_value(order_total) OVER w as last_total,
row_number() OVER w as row_count
FROM orders
WINDOW w AS (PARTITION BY customer ORDER BY order_date DESC)
) s
WHERE row_count = 1 AND order_date < CURRENT_DATE - 30
Solution with DISTINCT ON
(https://www.postgresql.org/docs/9.5/static/sql-select.html#SQL-DISTINCT):
SELECT
customer, order_date, order_total
FROM (
SELECT DISTINCT ON (customer)
*,
first_value(order_date) OVER w as last_order,
first_value(order_total) OVER w as last_total
FROM orders
WINDOW w AS (PARTITION BY customer ORDER BY order_date DESC)
ORDER BY customer, order_date DESC
) s
WHERE order_date < CURRENT_DATE - 30
Explanation:
In both solutions I am working with the first_value
window function. The window function's frame is defined by customers. The rows within the customers' groups are ordered descending by date which gives the latest row first (last_value
is not working as expected every time). So it is possible to get the last order_date
and the last order_total
of this order.
The difference between both solutions is the filtering. I showed both versions because sometimes one of them is significantly faster
The window function style is creating a row count within the frames. Every first row can be filtered later. This is done by adding a row_number
window function. The benefit of this solution comes out when you are trying to filter the first two or three data sets. You simply have to change the filter from WHERE row_count = 1
to WHERE row_count = 2
But if you want only one single row per group you just need to ensure that the expected row per group is ordered to be the first row in the group. Then the DISTINCT ON
function can delete all following rows. DISTINCT ON (customer)
gives the first (ordered) row per customer
group.