I have a table called historic:

create table historic
(
    id serial not null
        constraint table_name_pkey
            primary key,
    film_name varchar,
    category varchar,
    time_utc timestamp
)
;

create unique index table_name_id_uindex
    on historic (id)
;

I also have another table with measurements data:

create table measurements
(
    id serial not null
        constraint measurements_pkey
            primary key,
    historic_rowid integer not null
        constraint measurements_historic_id_fk
            references historic,
    measurement double precision
)
;

create unique index measurements_id_uindex
    on measurements (id)
;

As you can see, the measurements table has a foreign key, historic_rowid, that references the id column of the historic table.

I need to choose a category, say sci-fi. I then want to select all measurements records matching the sci-fi category and include their time:

SELECT h.film_name, h.category, m.measurement, h.time_utc
FROM historic h
LEFT JOIN measurements m on m.historic_rowid = h.id
WHERE h.category = 'sci-fi';

The result would be a table with the following columns:

film_name, category, measurement, time_utc

Now, I want to average this data every 15 minutes. In other words, I want to "bin" my data into 15-minute intervals and then, for each "bin", get the average.

My final result will look something like this:

film_name, category, measurement, time_window
---------------------------------------------
film_a,    sci-fi,    0.234234,    0_to_15
film_b,    sci-fi,    0.692859,    15_to_30
film_c,    sci-fi,    0.875854,    30_to_45
film_d,    sci-fi,    0.583465,    45_to_60
film_e,    sci-fi,    0.265334,    60_to_75
film_f,    sci-fi,    0.152545,    75_to_90
....

How can I do this? I'm rather rubbish with SQL and could use some help.

UPDATE

As requested, here is some sample data for the time_utc field:

2017-04-18 02:31:03
2017-04-18 02:31:12
2017-04-18 02:31:27
2017-04-18 02:31:38
2017-04-18 02:31:53
2017-04-18 02:32:08
2017-04-18 02:32:17
2017-04-18 02:32:22
2017-04-18 02:32:58
2017-04-18 02:33:07
2017-04-18 02:33:12
2017-04-18 02:33:22
2017-04-18 02:33:37
2017-04-18 02:33:47
2017-04-18 02:34:32
2017-04-18 02:34:43
2017-04-18 02:34:47
2017-04-18 02:34:58
2017-04-18 02:35:02
2017-04-18 02:35:12
2017-04-18 02:35:17
2017-04-18 02:35:22
2017-04-18 02:35:32
2017-04-18 02:35:37
2017-04-18 02:35:42
2017-04-18 02:35:52
pookie
  • Possible duplicate of [Postgresql SQL GROUP BY time interval with arbitrary accuracy (down to milli seconds)](https://stackoverflow.com/questions/12045600/postgresql-sql-group-by-time-interval-with-arbitrary-accuracy-down-to-milli-sec) – Laposhasú Acsa Aug 21 '17 at 10:01
  • The easiest way, I would say, is to use generate_series with a 15-minute interval, but the question is when the window starts and when it ends, and is it produced from the time_utc field? Could you please give a sample of those four columns before you bin them? – Vao Tsun Aug 21 '17 at 10:01
  • @VaoTsun Thanks, I've updated the post. The `film_name`, `category` and `measurement` columns are basically the same as described in the example. – pookie Aug 21 '17 at 10:07

1 Answer

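-- Bin starts, 15 minutes apart. The bounds are placeholders covering the sample day;
-- derive them from min(time_utc) and max(time_utc) for real data.
with m15 as (select generate_series('2017-04-18 00:00:00'::timestamp,'2017-04-19 00:00:00'::timestamp,'15 minutes'::interval) g)
SELECT h.film_name, h.category, avg(m.measurement) AS measurement, m15.g AS time_window
FROM historic h
LEFT JOIN measurements m on m.historic_rowid = h.id
-- a row belongs to a bin when bin start <= time_utc < bin start + 15 minutes
join m15 on m15.g <= h.time_utc and h.time_utc < m15.g + '15 minutes'::interval
WHERE h.category = 'sci-fi'
group by h.film_name, h.category, m15.g
order by m15.g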

The join to m15 may need to be an outer join if you want to include empty intervals. You also need to define the minimum and maximum for generate_series, which you can get with select min(time_utc) and max(time_utc).
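
The two remarks above can be combined into one query. The following is a minimal sketch, not run against the asker's data. It assumes PostgreSQL, takes the series bounds from min(time_utc) and max(time_utc) of the chosen category, and keeps m15 on the preserved side of the outer join so that empty intervals still show up (with a NULL average). The names bounds, bin_start and avg_measurement are made up for illustration.

```sql
-- Sketch: 15-minute averages for one category, including empty bins.
-- Assumes PostgreSQL; "bounds", "bin_start" and "avg_measurement" are illustrative names.
with bounds as (
    -- first bin starts at the top of the hour containing the earliest row
    select date_trunc('hour', min(time_utc)) as lo,
           max(time_utc)                     as hi
    from historic
    where category = 'sci-fi'
),
m15 as (
    -- one row per bin start between lo and hi
    select generate_series(lo, hi, '15 minutes'::interval) as bin_start
    from bounds
)
select m15.bin_start,
       avg(m.measurement) as avg_measurement
from m15
left join historic h
       on h.category = 'sci-fi'          -- filter in ON, not WHERE, so empty bins survive
      and h.time_utc >= m15.bin_start
      and h.time_utc <  m15.bin_start + '15 minutes'::interval
left join measurements m
       on m.historic_rowid = h.id
group by m15.bin_start
order by m15.bin_start;
```

This version averages over the whole category per bin. To get one average per film per bin, add h.film_name to both the select list and the group by.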

Vao Tsun