Assuming the following (which should be defined in the question):
- Postgres 15.
- The table is big, many rows per label, performance matters, we can add indexes.
- All columns are actually
NOT NULL
, you just forgot to declare columns as such.
- Evey "light" has a distinct
id_dec
and a distinct label
. Having both in switch_times
is redundant. (Normalization!)
- A light is "switched on" if the most recent earlier entry has
is1 IS TRUE
. Else it's considered "off".
- The order of rows is established by
ts
, not by id
as used in your query (typically incorrect).
- Consecutive entries do not have to change the state.
- No duplicate entries for
(id_dec, ts)
. (There is a unique index enforcing that.)
- There is no minimum or maximum time interval between entries.
- "The 25th" is supposed to mean
tstzrange '[2022-11-25 0:0+02, 2022-11-26 0:0+02)'
(Note the time zone offsets.)
- You want results for all labels that were switched on at all during the given time interval.
- There is a table "labels" with one distinct entry per relevant light. If you don't have one, create it.
Indexes
Have at least these indexes to make everything fast:
CREATE INDEX ON switch_times (id_dec, ts DESC);
CREATE INDEX ON switch_times (ts);
Optional step to create table labels
CREATE TABLE labels AS
WITH RECURSIVE cte AS (
(
SELECT id_dec, label
FROM switch_times
ORDER BY 1
LIMIT 1
)
UNION ALL
(
SELECT s.id_dec, s.label
FROM cte c
JOIN switch_times s ON s.id_dec > c.id_dec
ORDER BY 1
LIMIT 1
)
)
TABLE cte;
ALTER TABLE labels
ADD PRIMARY KEY (id_dec)
, ALTER COLUMN label SET NOT NULL
, ADD CONSTRAINT label_uni UNIQUE (label)
;
Why this way? See:
Main query
WITH bounds(lo, hi) AS (
SELECT timestamptz '2022-11-25 0:0+02' -- enter time interval here *once*
, timestamptz '2022-11-26 0:0+02'
)
, snapshot AS (
SELECT id_dec, label, is1, ts
FROM switch_times s, bounds b
WHERE s.ts >= b.lo
AND s.ts < b.hi
UNION ALL -- must be separate
SELECT s.*
FROM labels l
JOIN LATERAL ( -- latest earlier entry
SELECT s.id_dec, s.label, s.is1, b.lo AS ts -- cut off at lower bound
FROM switch_times s, bounds b
WHERE s.id_dec = l.id_dec
AND s.ts < b.lo
ORDER BY s.ts DESC
LIMIT 1
) s ON s.is1 -- ... if it's "on"
)
SELECT label, sum(z - a) AS duration
FROM (
SELECT label
, lag(is1, 1, false) OVER w AS last_is1
, lag(ts) OVER w AS a
, ts AS z
FROM snapshot
WINDOW w AS (PARTITION BY label ORDER BY ts ROWS UNBOUNDED PRECEDING)
) sub
WHERE last_is1
GROUP BY 1;
fiddle
CTE bounds
is an optional convenience feature to enter lower and upper bound of your time interval once.
CTE snapshot
collects all rows of interest, which consists of
- all rows inside the time interval (1st leg of
UNION ALL
query)
- the latest earlier row if it was "on" (2nd leg of
UNION ALL
query)
We need to gather 2. separately to cover corner cases where the light was switched on earlier and there is no entry for the given time interval! But we can replace the timestamp to the lower bound immediately.
The final query gets the previous (is1, ts)
for every row in a subquery, defaulting to "off" if there was no previous row.
Finally sum up intervals in the outer SELECT
. Only sum what's switched on at the begin (no matter the final state).
Related: