1

I'm having a slight issue. I have a PostgreSQL table with such format

time (datetime)     | players (int) | servers (int)
---------------------------------------------------
2013-12-06 13:40:01 | 80            | 20
2013-12-06 13:41:13 | 78            | 21
etc.

I would like to group them by 5 minute periods and get an average of the group as a single value, so there will be 20% of the records, each containing an average of ~5 numbers, with time set to the first time value in the group. I have no idea how to do this in PgSQL. So the result would be:

2013-12-06 13:40:01 | avg of players on :40, :41, :42, :43, :44 | same with servers
2013-12-06 13:45:05 | avg of players on :45, :46, :47, :48, :49 | same with servers
2013-12-06 13:50:09 | avg of players on :50, :51, :52, :53, :54 | same with servers
2013-12-06 13:55:12 | avg of players on :55, :56, :57, :58, :59 | same with servers
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
Piotr Zduniak
  • 97
  • 2
  • 8

3 Answers3

7
SELECT grid.t5
      ,min(t."time") AS min_time
--    ,array_agg(extract(min FROM t."time")) AS 'players_on' -- optional
      ,avg(t.players) AS avg_players
      ,avg(t.servers) AS avg_servers
FROM (
   SELECT generate_series(min("time")
                         ,max("time"), interval '5 min') AS t5
   FROM tbl
   ) grid
LEFT JOIN tbl t ON t."time" >= grid.t5
               AND t."time" <  grid.t5 +  interval '5 min'
GROUP  BY grid.t5
ORDER  BY grid.t5;

Explain

  • The subquery grid produces one row for every 5 minutes from the minimum to the maximum of "time" in your table.

  • LEFT JOIN back to the table slicing data in 5-min intervals. Carefully include lower border and exclude upper border.

  • To drop 5-min-slots where nothing happened, use JOIN in place of LEFT JOIN.

  • To have your grid-times start at 0:00, 5:00 etc, round down the min("time") in generate_series().

More explanation in these related answers:
Group by data intervals
PostgreSQL: running count of rows for a query 'by minute'

Aside: I wouldn't use time as identifier. It's a reserved word in standard SQL and a function / type name in Postgres.

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
4

Try this, it should group minutes 0-4, 5-9, 10-14 and so on...

SELECT MIN(time), AVG(Players), AVG(Servers)
FROM MyTable t
GROUP BY date_trunc('hour', time),
    FLOOR(datepart('minute', time)/12)

EDIT: Changed the grouping to hour first and then to the Floor of minutes. I Think this should work.

user2989408
  • 3,127
  • 1
  • 17
  • 15
  • 3
    Produces one row per minute. – Erwin Brandstetter Dec 06 '13 at 17:17
  • Update: Works, but now you need to wrap the first column in an aggregate function. – Erwin Brandstetter Dec 06 '13 at 17:31
  • @ErwinBrandstetter With my previous query I overlooked that I was grouping until the minute component of time. Thanks for correcting me. Can't we just `SELECT` MIN(time) and remove the `date_trunc` in SELECT, though your answer would be a more efficient way to achieve this. – user2989408 Dec 06 '13 at 17:34
  • Yes, min(time) as 1st column works. Produces a similar result like my query (with a JOIN instead of a LEFT JOIN). I'd expect this one to be slower, though, due to more calculations per row. Still, +1. – Erwin Brandstetter Dec 06 '13 at 17:38
  • @ErwinBrandstetter Thanks for reaffirming that though it may be inefficient, it would work. – user2989408 Dec 06 '13 at 17:42
1

How about this?

select datepart('year', time) as StartYear, datepart('month', time) as StartMonth,
    datepart('day', time) as StartDay, datepart('hour', time) as StartHour,
    floor(datepart('minute', time)/5)*5 as StartMinute,
    avg(case when datepart('minute', time) = floor(datepart('minute', time)/5)*5 then players else null end) as Zero,
    avg(case when datepart('minute', time) = floor(datepart('minute', time)/5)*5+1 then players else null end) as One,
    avg(case when datepart('minute', time) = floor(datepart('minute', time)/5)*5+2 then players else null end) as Two,
    avg(case when datepart('minute', time) = floor(datepart('minute', time)/5)*5+3 then players else null end) as Three,
    avg(case when datepart('minute', time) = floor(datepart('minute', time)/5)*5+4 then players else null end) as Four,
from MyTable
group by datepart('year', time), datepart('month', time),
    datepart('day', time), datepart('hour', time),
    floor(datepart('minute', time)/5)*5
John Chrysostom
  • 3,973
  • 1
  • 34
  • 50