-1

We have

A wide-format PostgreSQL table of time series data from various sensors:

datetime s_1 ... s_n Note
20230101T10:00:00Z 3.00 ... null
20230101T11:00:00Z 2.00 ... 10.0
20230101T12:00:00Z 1.00 ... null
20230101T23:00:00Z 4.00 ... null next day in tz=Europe/Berlin!
20230102T10:00:00Z 2.00 ... null
20230102T10:00:00Z 1.00 ... 9.3

datetime is a UTC-timestamp and s_i are floats. Marked as bold are the sought-after values. In practice, n is a few 100.

We want

An SQL query (preferably one, avoiding intermediate tables) which yields a table in long format containing for each sensor s_i its maximal (non-null) value for each day in local time (Europe/Berlin, +01:00/+02:00 wrt UTC during normal/DST time):

datetime variable value
20230101T10:00:00Z s_1 3.00
20230101T23:00:00Z s_1 4.00
20230101T11:00:00Z s_n 10.00
20230102T10:00:00Z s_n 9.3

Less-optimal alternative would be the same table in wide format:

datetime s_1 ... s_n Note
20230101T10:00:00Z 3.00 ... null
20230101T11:00:00Z null ... 10.0 non-maximal value for s_1 set to null!
20230101T23:00:00Z 4.00 ... null
20230102T10:00:00Z null ... 9.3 non-maximal value for s_1 set to null!

What we tried

Actually a lot. We got closes to a solution with the query

SELECT 
    date (timezone('Europe/Berlin', datetime))::timestamptz as datetime, 
    MAX (s_1) as s_1_max, 
    MAX (s_n) as s_n_max
FROM table
GROUP BY date (timezone('Europe/Berlin', datetime))

which yields a wide-format table of this form:

datetime s_1_max ... s_n_max Note
20230101T00:00:00Z 3.00 ... 10.0 one row per day, time lost
20230102T00:00:00Z 4.00 ... 9.3 one row per day, time lost

Yet, the time information is lost (all times set to 00:00:00) here, and all sought-after maximal values are aggregated in one row.

  • Does this answer your question? [Convert a UTC timezone in postgresql to EST (local time)](https://stackoverflow.com/questions/36341060/convert-a-utc-timezone-in-postgresql-to-est-local-time) (You just need to change the timezone 'EST' to the correct one...) – Luuk Apr 27 '23 at 11:12
  • @Luuk Unfortunately, it doesn't. The question is not how to convert time zones (we know how to do it), but how to GROUP BY a date extracted from a timestamp, without loosing the time associated with that time stamp. At least considering our best attempt I showed. There may be completely different ways, though. – Christian Hoffmann Apr 27 '23 at 11:17
  • "how to GROUP BY a date extracted from a timestamp, without loosing the time" Group by date, without loosing the time is NOT possible. – Luuk Apr 27 '23 at 11:19
  • @Luuk I am aware of this. Otherwise we would have solved the problem ;) I am rather looking for an alternative which achives our aim with difference query. – Christian Hoffmann Apr 27 '23 at 11:40
  • Can you create a [DBFIDDLe](https://dbfiddle.uk/btGcOH30) with the data? – Luuk Apr 27 '23 at 11:42

2 Answers2

0

dbfiddle demo

select datetime, col, val 
from (
    select datetime, col, val, 
           rank() over (partition by dt, col order by val desc) rnk 
    from ( 
        select datetime,
               date(timezone('Europe/Berlin', datetime)) dt,
               unnest(array['s_1', 's_2']) as col,
               unnest(array[s_1, s_2]) as val
        from tbl t) unpivoted
    where val is not null ) ranked
where rnk = 1
Ponder Stibbons
  • 14,723
  • 2
  • 21
  • 24
  • Thanks a lot, your query solved the problem! Clever combination of PARTITION BY, ORDER and RANK for finding the maximum. I'll keep the "unpivot" part in mind, that we come in handy in other situations, too! – Christian Hoffmann Apr 27 '23 at 12:35
0

When doing:

select date(timezone('Europe/Berlin','20230101T10:00:00Z'))::timestamptz;

You will convert the datetime value to a date, which looses time (2023-01-01 00:00:00+01).

do this:

select (timezone('Europe/Berlin','20230101T10:00:00Z'))::timestamptz;

results in: 2023-01-01 11:00:00+01, which is the converted (date)time in Berlin-timezone.

Luuk
  • 12,245
  • 5
  • 22
  • 33
  • sure, omitting the call to date(...) retains the time. Yet as we GROUPBY date(...), that date(...) must also appear in the SELECT. So this works as one-liner, but not in the context of the last SQL query I posted. – Christian Hoffmann Apr 27 '23 at 12:31
  • @ChristianHoffmann: It's up to you to correct the SQL, and also leave the `date()` from the GROUP BY` (Which seems obvious to /me) – Luuk Apr 27 '23 at 12:37
  • Sure, I can drop it the date() from both the SELECT and the GROUP BY to avoid SQL errors. But that would compromise our ultimate aim: to find the maximum value for each sensor in each day. That is the central part of my question. – Christian Hoffmann Apr 27 '23 at 12:46
  • But you need/want to group on German days? Then you should first convert to a datetime in German, and only after that do the conversion to date (without time). – Luuk Apr 27 '23 at 12:49