1

Let's say I have several columns, one of which has the datetimes below:

2015-07-19 17:00:01
2015-07-19 17:15:01
2015-07-19 17:30:01
2015-07-19 17:45:01
2015-07-19 18:00:01
2015-07-19 18:15:01
2015-07-19 18:30:01
2015-07-19 18:45:01
2015-07-19 19:00:01
2015-07-19 19:15:01
2015-07-19 19:30:01
2015-07-19 19:45:01
2015-07-19 20:00:01
2015-07-19 20:15:01

I want the output to cycle every 3 hours for aggregation purposes:

2015-07-19 17:00:01, max(column B), etc
2015-07-19 20:00:01, max(column B), etc
2015-07-19 23:00:01, max(column B), etc
2015-07-20 02:00:01, max(column B), etc

My attempt:

SELECT
datetime_col,
min(col_b)
FROM table
where datetime_col >= STR_TO_DATE('2015-07-19 17:00:01','%Y-%m-%d %H:%i:%s')
GROUP BY
YEAR(datetime_col),
MONTH(datetime_col),
DAY(datetime_col),
ROUND(HOUR(datetime_col)/3);

Actual output:

2015-07-19 17:00:01
2015-07-19 20:00:01
2015-07-19 23:00:01
2015-07-20 00:00:01
2015-07-20 02:00:01

You can see that at first the grouping looks good until it reaches a new day. I need to group it on a 3 hour cycle regardless of day, month, year, etc.

Preferably I want to do this in a single query since I call it in a C# application;

somethingstrang
  • 1,079
  • 2
  • 14
  • 29

3 Answers3

0

The easy way is create a table with ranges for dates and hours.

Here is an example with day, you will need expand with hours.

generate days from date range

So you will have your hour_block table

id   startTime            endTime
 1   2015-07-19 00:00:01  2015-07-19 03:00:00
 2   2015-07-19 03:00:01  2015-07-19 06:00:00
 3   2015-07-19 06:00:01  2015-07-19 09:00:00
 ...         ...                ...
 8   2015-07-19 21:00:01  2015-07-20 00:00:00

Then you create a join

 SELECT H.id, H.startTime, max(Y.column B)
 From YourTable Y
 JOIN hour_block H
   ON Y.datetime_col between H.startTime and H.endTime
 GROUP BY H.id, H.startTime
Community
  • 1
  • 1
Juan Carlos Oropeza
  • 47,252
  • 12
  • 78
  • 118
  • JOIN-ing on range conditions can't use indexes. – Vatev Nov 13 '15 at 16:19
  • @Vatev Can you share some documentation to describe that fact? – Juan Carlos Oropeza Nov 13 '15 at 16:20
  • Facts don't need documentation :) In this case it's actually worse than the range condition problem, because the join uses 2 different columns and even if it were possible to join on a range it will only one of them (which in this case is as bad as no index). – Vatev Nov 13 '15 at 16:24
  • @Vatev Yes, they need one, If you dont provide a proof is just an hypothesis. – Juan Carlos Oropeza Nov 13 '15 at 17:19
  • Think of it as a warning. Anyone who decides to use this should check the query plan and consider the performance impact. My goal is not to prove you wrong or anything of that sort. – Vatev Nov 13 '15 at 17:22
  • @Vatev thanks for your comment I will try do some test on my own for this case and let you know. – Juan Carlos Oropeza Nov 13 '15 at 17:31
  • @Vatev Check this sample [**SqlFidle**](http://sqlfiddle.com/#!9/58318/7) execution plan and how is using the index for range vs same table without index. – Juan Carlos Oropeza Nov 13 '15 at 21:44
  • [type=index](https://dev.mysql.com/doc/refman/5.7/en/explain-output.html#jointype_index) means it is scanning the entire index, not using it for searching. Which is pretty much the same as scanning the entire table. – Vatev Nov 14 '15 at 00:04
0

I would handle this by converting the value to the nearest previous 3-hour period. This is easy enough using to_seconds(), but the conversion back isn't allowed. So:

SELECT FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(datetimecol) / (60 * 60 / 3) )) as datetime,
      min(col_b)
FROM table
WHERE datetime_col >= STR_TO_DATE('2015-07-19 17:00:01', '%Y-%m-%d %H:%i:%s')
GROUP BY FLOOR(UNIX_TIMESTAMP(datetimecol) / (60 * 60 / 3) )
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Assuming you have the 3 hours interval

CREATE TABLE T
(dcol datetime,b INT);

INSERT INTO t
VALUES('2015-07-19 17:00:01',23),
('2015-07-19 18:30:01',25),
('2015-07-19 19:00:01',66),
('2015-07-19 20:00:01',99),
('2015-07-19 21:00:01',5),
('2015-07-19 23:00:01',2),
('2015-07-20 02:00:01',78),
('2015-07-20 03:00:01',9),
('2015-07-20 05:00:01',11),
('2015-07-20 07:00:01',29)

SELECT t.dcol,t.b
FROM t JOIN t tt WHERE t.dcol=tt.dcol+INTERVAL 3 HOUR
UNION
SELECT tt.dcol,tt.b
FROM t JOIN t tt WHERE t.dcol=tt.dcol+INTERVAL 3 HOUR
ORDER BY dcol

dcol                 b
2015-07-19 17:00:01  23
2015-07-19 20:00:01  99
2015-07-19 23:00:01  2
2015-07-20 02:00:01  78
2015-07-20 05:00:01  11
Mihai
  • 26,325
  • 7
  • 66
  • 81