Step 1: Release the handbrakes
... if a bit slow
SELECT to_char(MIN(ts)::timestamptz, 'YYYY-MM-DD HH24:MI:SS TZ') AS min_time
,SUM(CASE WHEN sensor_id = 572 THEN value ELSE 0.0 END) AS nickname1
,SUM(CASE WHEN sensor_id = 542 THEN value ELSE 0.0 END) AS nickname2
,SUM(CASE WHEN sensor_id = 571 THEN value ELSE 0.0 END) AS nickname3
FROM sensor_values
-- LEFT JOIN sensor_values_cleaned s2 USING (sensor_id, ts)
WHERE ts >= '2013-10-14T00:00:00+00:00'::timestamptz::timestamp
AND ts < '2013-10-18T00:00:00+00:00'::timestamptz::timestamp
AND sensor_id IN (572, 542, 571, 540, 541, 573)
GROUP BY ts::date
ORDER BY ts::date;
Major points
Replace reserved words (in standard SQL) in your identifiers.
timestamp -> ts
time -> min_time
Since the join is on identical column names, you can use the simpler USING clause as the join condition: USING (sensor_id, ts)
However, since the second table sensor_values_cleaned is 100% irrelevant to this query, I removed it entirely (it is only left as a comment in the query above).
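For illustration, a minimal sketch of the two equivalent join spellings. It assumes both tables carry the columns sensor_id and ts (as the USING clause implies); the aliases s1 and s2 are only for this example:

```sql
-- Explicit join condition on identically named columns
SELECT s1.sensor_id, s1.ts, s1.value
FROM   sensor_values s1
LEFT   JOIN sensor_values_cleaned s2 ON  s2.sensor_id = s1.sensor_id
                                     AND s2.ts = s1.ts;

-- Same join with USING; the join columns appear only once in the result
-- and can be referenced without table qualification
SELECT sensor_id, ts, s1.value
FROM   sensor_values s1
LEFT   JOIN sensor_values_cleaned s2 USING (sensor_id, ts);
```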
As @joop already advised, swap the order of min() and to_char() in your first output column. This way, Postgres can determine the minimum from the original column value, which is generally faster and may be able to utilize an index. In this specific case, ordering by a timestamp or date is also cheaper than ordering by text, which would additionally have to consider collation rules.
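To make the swap concrete, here is a sketch of both variants. The first form is only my assumption of what the original expression looked like; the second is the one used in the query above:

```sql
-- Assumed original: every row is formatted to text first, then min() compares strings
SELECT min(to_char(ts::timestamptz, 'YYYY-MM-DD HH24:MI:SS TZ')) AS min_time
FROM   sensor_values
GROUP  BY ts::date;

-- Swapped: min() works on the raw timestamp, to_char() runs once per group
SELECT to_char(min(ts)::timestamptz, 'YYYY-MM-DD HH24:MI:SS TZ') AS min_time
FROM   sensor_values
GROUP  BY ts::date;
```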
A similar consideration applies to your WHERE condition. Compare these two forms:
WHERE ts::timestamptz >= '2013-10-14T00:00:00+00:00'::timestamptz
WHERE ts >= '2013-10-14T00:00:00+00:00'::timestamptz::timestamp
The second one is sargable because the cast is applied to the constant rather than to the column, so it can utilize a plain index on ts, to great effect on performance in big tables!
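One way to check this on your own data is a sketch like the following. The index name sensor_values_ts_idx is hypothetical, and whether the planner actually picks the index depends on table size and statistics:

```sql
-- Plain b-tree index on the bare column (hypothetical name)
CREATE INDEX sensor_values_ts_idx ON sensor_values (ts);

-- Cast on the column: the plain index on ts cannot serve this predicate
EXPLAIN
SELECT count(*)
FROM   sensor_values
WHERE  ts::timestamptz >= '2013-10-14T00:00:00+00:00'::timestamptz;

-- Cast on the constant only: the predicate can be matched against the index on ts
EXPLAIN
SELECT count(*)
FROM   sensor_values
WHERE  ts >= '2013-10-14T00:00:00+00:00'::timestamptz::timestamp;
```

Compare the two plans; with enough rows in the table the difference should be easy to spot.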
Use ts::date instead of date_trunc('day', ts). Simpler, faster, and it produces the same groups (the result is a date rather than a timestamp truncated to midnight).
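A quick way to see the difference between the two expressions, using a literal instead of your table so it runs anywhere:

```sql
SELECT ts
     , ts::date              AS day_as_date       -- type date
     , date_trunc('day', ts) AS day_as_timestamp  -- type timestamp, time part set to 00:00:00
FROM  (VALUES (timestamp '2013-10-14 03:00:00')) AS t(ts);
```

Both identify the same calendar day, so they group rows identically.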
Most probably your second WHERE condition is slightly incorrect. Generally, you would exclude the upper border, so that a row at exactly 2013-10-18 00:00 is not pulled in (use < instead of <=):
AND ts <= '2013-10-18T00:00:00+00:00' ...
AND ts < '2013-10-18T00:00:00+00:00' ...
When mixing timestamp and timestamptz, one needs to be aware of the effects. For instance, your WHERE condition doesn't cut at 00:00 local time (except if local time coincides with UTC). Details here:
Ignoring timezones altogether in Rails and PostgreSQL
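A small sketch of that effect. The zone name Europe/Vienna is just an example I picked for a zone that was at UTC+2 on that date; substitute your own setting:

```sql
SET timezone = 'UTC';
SELECT '2013-10-14T00:00:00+00:00'::timestamptz::timestamp AS lower_bound;
-- 2013-10-14 00:00:00

SET timezone = 'Europe/Vienna';  -- example zone, UTC+2 (CEST) in mid-October 2013
SELECT '2013-10-14T00:00:00+00:00'::timestamptz::timestamp AS lower_bound;
-- 2013-10-14 02:00:00

RESET timezone;
```

The same literal yields a different timestamp boundary depending on the session's timezone setting, which is why the filter does not cut at local midnight.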
Step 2: Your request
...the difference between the latest and earliest timestamps in each grouping
And by that I suppose you mean:
...the difference between the value of the latest and earliest timestamps ...
Otherwise it would be much simpler.
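Use window functions for that, in particular first_value() and last_value(). Careful with the combination: you want a non-default window frame for last_value() in this case. With ORDER BY, the default frame ends at the current row (and its peers), so last_value() would not see the later rows of the partition. Compare: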
PostgreSQL aggregate or window function to return just the last value
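To see why the frame matters, here is a self-contained sketch with three made-up readings (not from your table):

```sql
SELECT ts, value
     , first_value(value) OVER (ORDER BY ts) AS first_val
     , last_value(value)  OVER (ORDER BY ts) AS last_default_frame
     , last_value(value)  OVER (ORDER BY ts
                                RANGE BETWEEN UNBOUNDED PRECEDING
                                          AND UNBOUNDED FOLLOWING) AS last_full_frame
FROM  (VALUES
         (timestamp '2013-10-14 00:00', 10.0)
       , (timestamp '2013-10-14 12:00', 15.0)
       , (timestamp '2013-10-14 23:00', 30.0)
      ) AS t(ts, value)
ORDER  BY ts;
-- first_val is 10.0 on every row
-- last_default_frame follows the current row: 10.0, 15.0, 30.0
-- last_full_frame is 30.0 on every row
```

Only the variant with the explicit frame returns the value of the latest timestamp on every row.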
I combine this with DISTINCT ON, which is more convenient in this case than GROUP BY (which would need another subquery level):
SELECT DISTINCT ON (ts::date, sensor_id)
ts::date AS day
,to_char((min(ts) OVER (PARTITION BY ts::date))::timestamptz
,'YYYY-MM-DD HH24:MI:SS TZ') AS min_time
,sensor_id
,last_value(value) OVER (PARTITION BY ts::date, sensor_id ORDER BY ts
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
- first_value(value) OVER (PARTITION BY ts::date, sensor_id ORDER BY ts)
AS val_range
FROM sensor_values
WHERE ts >= '2013-10-14T00:00:00+0'::timestamptz::timestamp
AND ts < '2013-10-18T00:00:00+0'::timestamptz::timestamp
AND sensor_id IN (540, 541, 542, 571, 572, 573)
ORDER BY ts::date, sensor_id;
-> SQLfiddle demo.
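For comparison, a sketch of the GROUP BY route, which needs the extra subquery level because window functions are evaluated after grouping. The alias sub is mine, and min_time is left out to keep it short:

```sql
SELECT day, sensor_id
     , min(val_range) AS val_range  -- val_range is identical on all rows of a group
FROM  (
   SELECT ts::date AS day
        , sensor_id
        , last_value(value)  OVER (PARTITION BY ts::date, sensor_id ORDER BY ts
                                   RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
        - first_value(value) OVER (PARTITION BY ts::date, sensor_id ORDER BY ts) AS val_range
   FROM   sensor_values
   WHERE  ts >= '2013-10-14T00:00:00+0'::timestamptz::timestamp
   AND    ts <  '2013-10-18T00:00:00+0'::timestamptz::timestamp
   AND    sensor_id IN (540, 541, 542, 571, 572, 573)
   ) sub
GROUP  BY day, sensor_id
ORDER  BY day, sensor_id;
```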
Step 3: Pivot table
Building on the query above, I use crosstab() from the additional module tablefunc:
SELECT * FROM crosstab(
$$SELECT DISTINCT ON (1,3)
ts::date AS day
,to_char((min(ts) OVER (PARTITION BY ts::date))::timestamptz,'YYYY-MM-DD HH24:MI:SS TZ') AS min_time
,sensor_id
,last_value(value) OVER (PARTITION BY ts::date, sensor_id ORDER BY ts RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
- first_value(value) OVER (PARTITION BY ts::date, sensor_id ORDER BY ts) AS val_range
FROM sensor_values
WHERE ts >= '2013-10-14T00:00:00+0'::timestamptz::timestamp
AND ts < '2013-10-18T00:00:00+0'::timestamptz::timestamp
AND sensor_id IN (540, 541, 542, 571, 572, 573)
ORDER BY 1, 3$$
,$$VALUES (540), (541), (542), (571), (572), (573)$$
)
AS ct (day date, min_time text, s540 numeric, s541 numeric, s542 numeric, s571 numeric, s572 numeric, s573 numeric);
Returns (and much faster than before):
 day        | min_time                 | s540  | s541  | s542  | s571  | s572  | s573
------------+--------------------------+-------+-------+-------+-------+-------+-------
 2013-10-14 | 2013-10-14 03:00:00 CEST | 18.82 | 18.98 | 19.97 | 19.47 | 17.56 | 21.27
 2013-10-15 | 2013-10-15 00:15:00 CEST | 22.59 | 24.20 | 22.90 | 21.27 | 22.75 | 22.23
 2013-10-16 | 2013-10-16 00:16:00 CEST | 23.74 | 22.52 | 22.23 | 23.22 | 23.03 | 22.98
 2013-10-17 | 2013-10-17 00:17:00 CEST | 21.68 | 24.54 | 21.15 | 23.58 | 23.04 | 21.94
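If crosstab() is new to you, here is a minimal, self-contained sketch of the two-parameter form with made-up values (not from your table). It assumes you are allowed to install the tablefunc extension in your database:

```sql
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT * FROM crosstab(
   -- 1st parameter: source query returning row name, category, value
   $$SELECT * FROM (VALUES
        (date '2013-10-14', 540, 1.5)
      , (date '2013-10-14', 542, 2.5)
      , (date '2013-10-15', 541, 3.5)
     ) AS v(day, sensor_id, val)
     ORDER BY 1, 2$$
   -- 2nd parameter: the full list of categories, one output column each
 , $$VALUES (540), (541), (542)$$
   )
AS ct (day date, s540 numeric, s541 numeric, s542 numeric);
-- Sensors without a row for a given day come back as NULL,
-- e.g. s541 is NULL for 2013-10-14.
```

The second parameter is what keeps each value in the right column even when a sensor has no data for a day.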