In a Postgres DB I have entries for "events", associated with an id, and when they happened. I need to count them with a special rule.
When an event happens the counter is incremented and for the next 14 days all events of this type are not counted.
Example:
event | created_at | blockdate | action |
---|---|---|---|
16 | 2021-11-11 11:15 | 25.11.21 | count |
16 | 2021-11-11 11:15 | 25.11.21 | block |
16 | 2021-11-13 10:45 | 25.11.21 | block |
16 | 2021-11-16 10:40 | 25.11.21 | block |
16 | 2021-11-23 11:15 | 25.11.21 | block |
16 | 2021-11-23 11:15 | 25.11.21 | block |
16 | 2021-12-10 13:00 | 24.12.21 | count |
16 | 2021-12-15 13:25 | 24.12.21 | block |
16 | 2021-12-15 13:25 | 24.12.21 | block |
16 | 2021-12-15 13:25 | 24.12.21 | block |
16 | 2021-12-20 13:15 | 24.12.21 | block |
16 | 2021-12-23 13:15 | 24.12.21 | block |
16 | 2021-12-31 13:25 | 14.01.22 | count |
16 | 2022-02-05 15:00 | 19.02.22 | count |
16 | 2022-02-05 15:00 | 19.02.22 | block |
16 | 2022-02-13 17:15 | 19.02.22 | block |
16 | 2022-02-21 10:09 | 07.03.22 | count |
43 | 2021-11-26 11:00 | 10.12.21 | count |
43 | 2022-01-01 15:00 | 15.01.22 | count |
43 | 2022-04-13 10:07 | 27.04.22 | count |
43 | 2022-04-13 10:09 | 27.04.22 | block |
43 | 2022-04-13 10:09 | 27.04.22 | block |
43 | 2022-04-13 10:09 | 27.04.22 | block |
43 | 2022-04-13 10:10 | 27.04.22 | block |
43 | 2022-04-13 10:10 | 27.04.22 | block |
43 | 2022-04-13 10:47 | 27.04.22 | block |
43 | 2022-05-11 20:25 | 25.05.22 | count |
75 | 2021-10-21 12:50 | 04.11.21 | count |
75 | 2021-11-02 12:50 | 04.11.21 | block |
75 | 2021-11-18 11:15 | 02.12.21 | count |
75 | 2021-11-18 12:55 | 02.12.21 | block |
75 | 2021-11-18 16:35 | 02.12.21 | block |
75 | 2021-11-24 11:00 | 02.12.21 | block |
75 | 2021-12-01 11:00 | 02.12.21 | block |
75 | 2021-12-14 13:25 | 28.12.21 | count |
75 | 2021-12-15 13:35 | 28.12.21 | block |
75 | 2021-12-26 13:25 | 28.12.21 | block |
75 | 2022-01-31 15:00 | 14.02.22 | count |
75 | 2022-02-02 15:30 | 14.02.22 | block |
75 | 2022-02-03 15:00 | 14.02.22 | block |
75 | 2022-02-17 15:00 | 03.03.22 | count |
75 | 2022-02-17 15:00 | 03.03.22 | block |
75 | 2022-02-18 15:00 | 03.03.22 | block |
75 | 2022-02-23 15:00 | 03.03.22 | block |
75 | 2022-02-25 15:00 | 03.03.22 | block |
75 | 2022-03-04 10:46 | 18.03.22 | count |
75 | 2022-03-08 21:05 | 18.03.22 | block |
In Excel I simply add two columns. In one column I carry over a "blockdate", a date until when events have to be blocked. In the other column I compare the ID with the previous ID and the previous "blockdate".
When the IDs a different or the blockdate is less then the current date, I have to count. When I have to count, I set the row's blockdate to the current date + 14 days, otherwise I carry over the previous blockdate.
I tried now to solve this in Postgres with ...
- window functions
- recursive CTEs
- lateral joins
... and all seemed a bit promising, but in the end I failed to implement this tricky count.
For example, my recursive CTE failed with:
aggregate functions are not allowed in WHERE
with recursive event_count AS (
select event
, min(created_at) as created
from test
group by event
union all
( select event
, created_at as created
from test
join event_count
using(event)
where created_at >= max(created) + INTERVAL '14 days'
order by created_at
limit 1
)
)
select * from event_count
Window functions, using lag()
to access the previous row don't seem to work because they cannot access columns in the previous row which were created using the window function.
Adding a "block-or-count" information upon entering a new event entry by simply comparing with the last entry wouldn't solve the issue as event entries "go away" after about half a year. So when the first entry goes away, the next one becomes the first and the logic has to be applied on the new situation.
Above test data can be created with:
CREATE TABLE test (
event INTEGER,
created_at TIMESTAMP
);
INSERT INTO test (event, created_at) VALUES
(16, '2021-11-11 11:15'),(16, '2021-11-11 11:15'),(16, '2021-11-13 10:45'),(16, '2021-11-16 10:40'),
(16, '2021-11-23 11:15'),(16, '2021-11-23 11:15'),(16, '2021-12-10 13:00'),(16, '2021-12-15 13:25'),
(16, '2021-12-15 13:25'),(16, '2021-12-15 13:25'),(16, '2021-12-20 13:15'),(16, '2021-12-23 13:15'),
(16, '2021-12-31 13:25'),(16, '2022-02-05 15:00'),(16, '2022-02-05 15:00'),(16, '2022-02-13 17:15'),
(16, '2022-02-21 10:09'),
(43, '2021-11-26 11:00'),(43, '2022-01-01 15:00'),(43, '2022-04-13 10:07'),(43, '2022-04-13 10:09'),
(43, '2022-04-13 10:09'),(43, '2022-04-13 10:09'),(43, '2022-04-13 10:10'),(43, '2022-04-13 10:10'),
(43, '2022-04-13 10:47'),(43, '2022-05-11 20:25'),
(75, '2021-10-21 12:50'),(75, '2021-11-02 12:50'),(75, '2021-11-18 11:15'),(75, '2021-11-18 12:55'),
(75, '2021-11-18 16:35'),(75, '2021-11-24 11:00'),(75, '2021-12-01 11:00'),(75, '2021-12-14 13:25'),
(75, '2021-12-15 13:35'),(75, '2021-12-26 13:25'),(75, '2022-01-31 15:00'),(75, '2022-02-02 15:30'),
(75, '2022-02-03 15:00'),(75, '2022-02-17 15:00'),(75, '2022-02-17 15:00'),(75, '2022-02-18 15:00'),
(75, '2022-02-23 15:00'),(75, '2022-02-25 15:00'),(75, '2022-03-04 10:46'),(75, '2022-03-08 21:05');