8

I was lucky enough to find this awesome piece of code on Stack Overflow, however I wanted to change it up so it showed each half hour instead of every hour, but messing around with it, only caused me to ruin the query haha.

This is the SQL:

SELECT CONCAT(HOUR(created_at), ':00-', HOUR(created_at)+1, ':00') as hours,
       COUNT(*)
FROM urls
GROUP BY HOUR(created_at)
ORDER BY HOUR(created_at) ASC

How would I go about getting a result every half an hour? :)

Another thing, is that, if it there is half an hour with no results, I would like it to return 0 instead of just skipping that step. It looks kinda of weird win I do statistics over the query, when it just skips an hour because there were none :P

Jazerix
  • 4,729
  • 10
  • 39
  • 71

7 Answers7

16

If the format isn't too important, you can return two columns for the interval. You might even just need the start of the interval, which can be determined by:

date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start

the alias can be used in GROUP BY and ORDER BY clauses. If you also need the end of the interval, you will need a small modification:

SELECT
  date_format(created_at - interval minute(created_at)%30 minute, '%H:%i') as period_start,
  date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i') as period_end,
  COUNT(*)
FROM urls
GROUP BY period_start
ORDER BY period_start ASC;

Of course you can also concatenate the values:

SELECT concat_ws('-',
           date_format(created_at - interval minute(created_at)%30 minute, '%H:%i'),
           date_format(created_at + interval 30-minute(created_at)%30 minute, '%H:%i')
       ) as period,
       COUNT(*)
FROM urls
GROUP BY period
ORDER BY period ASC;

Demo: http://rextester.com/RPN50688

Another thing, is that, if it there is half an hour with no results, I would like it to return 0

If you use the result in a procedural language, you can initialize all 48 rows with zero in a loop and then "inject" the non-zero rows from the result.

However - If you need it to be done in SQL, you will need a table for a LEFT JOIN with at least 48 rows. That could be done inline with a "huge" UNION ALL statement, but (IMHO) it would be ugly. So I prefer to have sequence table with one integer column, which can be very usefull for reports. To create that table I usually use the information_schema.COLUMNS, since it is available on any MySQL server and has at least a couple of hundreds rows. If you need more rows - just join it with itself.

Now let's create that table:

drop table if exists helper_seq;
create table helper_seq (seq smallint auto_increment primary key)
    select null
    from information_schema.COLUMNS c1
       , information_schema.COLUMNS c2
    limit 100; -- adjust as needed

Now we have a table with integers from 1 to 100 (though right now you only need 48 - but this is for demonstration).

Using that table we can now create all 48 time intervals:

select time(0) + interval 30*(seq-1) minute as period_start,
       time(0) + interval 30*(seq)   minute as period_end
from helper_seq s
where s.seq <= 48;

We will get the following result:

period_start | period_end
    00:00:00 |   00:30:00
    00:30:00 |   01:00:00
...
   23:30:00  |   24:00:00

Demo: http://rextester.com/ISQSU31450

Now we can use it as a derived table (subquery in FROM clause) and LEFT JOIN your urls table:

select p.period_start, p.period_end, count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
    on  time(u.created_at) >= p.period_start
    and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start

Demo: http://rextester.com/IQYQ32927

Last step (if really needed) is to format the result. We can use CONCAT or CONCAT_WS and TIME_FORMAT in the outer select. The final query would be:

select concat_ws('-',
         time_format(p.period_start, '%H:%i'),
         time_format(p.period_end,   '%H:%i')
       ) as period,
       count(u.created_at) as cnt
from (
    select time(0) + interval 30*(seq-1) minute as period_start,
           time(0) + interval 30*(seq)   minute as period_end
    from helper_seq s
    where s.seq <= 48
) p
left join urls u
    on  time(u.created_at) >= p.period_start
    and time(u.created_at) <  p.period_end
group by p.period_start, p.period_end
order by p.period_start

The result would look like:

period      | cnt
00:00-00:30 |   1
00:30-01:00 |   0
...
23:30-24:00 |   3

Demo: http://rextester.com/LLZ41445

Paul Spiegel
  • 30,925
  • 5
  • 44
  • 53
1

You can add some math to calculate 48 intervals instead of 24 and put it into another field by which you're going to group and sort.

SELECT HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) as interval48, 
    if(HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) % 2 =0,
    CONCAT(HOUR(created_at), ':00-', HOUR(created_at), ':30'),
    CONCAT(HOUR(created_at), ':30-', HOUR(created_at)+1, ':00')
       )  as hours,
      count(*)
FROM urls
GROUP BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30)
ORDER BY HOUR(created_at)*2+FLOOR(MINUTE(created_at)/30) ASC

Example of result:

0   0:00-0:30   2017
1   0:30-1:00   1959
2   1:30-2:00   1830
3   1:30-2:00   1715
4   2:30-3:00   1679
5   2:30-3:00   1688

The result of original query posted by Jazerix was:

0:00-1:00 3976
1:00-2:00 3545
2:00-3:00 3367
vladatr
  • 616
  • 6
  • 15
1

Well, this could be a bit verbose but it works:

SELECT hours, SUM(count) as count FROM (
    SELECT CONCAT(HOUR(created_at), ':', LPAD(30 * FLOOR(MINUTE(created_at)/30), 2, '0'), '-',
                  HOUR(DATE_ADD(created_at, INTERVAL 30 minute)), ':', LPAD(30 * FLOOR(MINUTE(DATE_ADD(created_at, INTERVAL 30 minute))/30), 2, '0')) as hours,
           COUNT(*) as count
    FROM urls
    GROUP BY HOUR(created_at), FLOOR(MINUTE(created_at)/30)

    UNION ALL

    SELECT '00:00-00:30'as hours, 0 as count UNION ALL SELECT '00:30-01:00'as hours, 0 as count UNION ALL 
    SELECT '01:00-01:30'as hours, 0 as count UNION ALL SELECT '01:30-02:00'as hours, 0 as count UNION ALL 
    SELECT '02:00-02:30'as hours, 0 as count UNION ALL SELECT '02:30-03:00'as hours, 0 as count UNION ALL 
    SELECT '03:00-03:30'as hours, 0 as count UNION ALL SELECT '03:30-04:00'as hours, 0 as count UNION ALL 
    SELECT '04:00-04:30'as hours, 0 as count UNION ALL SELECT '04:30-05:00'as hours, 0 as count UNION ALL 
    SELECT '05:00-05:30'as hours, 0 as count UNION ALL SELECT '05:30-06:00'as hours, 0 as count UNION ALL 
    SELECT '06:00-06:30'as hours, 0 as count UNION ALL SELECT '06:30-07:00'as hours, 0 as count UNION ALL 
    SELECT '07:00-07:30'as hours, 0 as count UNION ALL SELECT '07:30-08:00'as hours, 0 as count UNION ALL 
    SELECT '08:00-08:30'as hours, 0 as count UNION ALL SELECT '08:30-09:00'as hours, 0 as count UNION ALL 
    SELECT '09:00-09:30'as hours, 0 as count UNION ALL SELECT '09:30-10:00'as hours, 0 as count UNION ALL 
    SELECT '10:00-10:30'as hours, 0 as count UNION ALL SELECT '10:30-11:00'as hours, 0 as count UNION ALL 
    SELECT '11:00-11:30'as hours, 0 as count UNION ALL SELECT '11:30-12:00'as hours, 0 as count UNION ALL 
    SELECT '12:00-12:30'as hours, 0 as count UNION ALL SELECT '12:30-13:00'as hours, 0 as count UNION ALL 
    SELECT '13:00-13:30'as hours, 0 as count UNION ALL SELECT '13:30-14:00'as hours, 0 as count UNION ALL 
    SELECT '14:00-14:30'as hours, 0 as count UNION ALL SELECT '14:30-15:00'as hours, 0 as count UNION ALL 
    SELECT '15:00-15:30'as hours, 0 as count UNION ALL SELECT '15:30-16:00'as hours, 0 as count UNION ALL 
    SELECT '16:00-16:30'as hours, 0 as count UNION ALL SELECT '16:30-17:00'as hours, 0 as count UNION ALL 
    SELECT '17:00-17:30'as hours, 0 as count UNION ALL SELECT '17:30-18:00'as hours, 0 as count UNION ALL 
    SELECT '18:00-18:30'as hours, 0 as count UNION ALL SELECT '18:30-19:00'as hours, 0 as count UNION ALL 
    SELECT '19:00-19:30'as hours, 0 as count UNION ALL SELECT '19:30-20:00'as hours, 0 as count UNION ALL 
    SELECT '20:00-20:30'as hours, 0 as count UNION ALL SELECT '20:30-21:00'as hours, 0 as count UNION ALL 
    SELECT '21:00-21:30'as hours, 0 as count UNION ALL SELECT '21:30-22:00'as hours, 0 as count UNION ALL 
    SELECT '22:00-22:30'as hours, 0 as count UNION ALL SELECT '22:30-23:00'as hours, 0 as count UNION ALL 
    SELECT '23:00-23:30'as hours, 0 as count UNION ALL SELECT '23:30-00:00'as hours, 0 as count 

) AS T
GROUP BY hours ORDER BY hours;

The most difficult part of your query is output of statistics for intervals that don't have any hits. SQL is all about querying and aggregating existing data; selecting or aggregating the data missing in the table is quite unordinary task. That's why, like Wolph stated in comments, there is no pretty solution for this task.

I solved this problem by explicitly selecting all half intervals of the day. This solution could be used if number of intervals is limited like in your case. This will not work however if you aggregate by different days from long period of time.

I'm not a fan of this query but I can't propose anything better. More elegant solution could be achieved with stored procedure with a loop, but seems like you want to solve it with raw SQL query.

CodeFuller
  • 30,317
  • 3
  • 63
  • 79
1
  1. Switch to seconds.
  2. Do arithmetic to get a number for each unit of time (using 30*60 for half-hour, in your case)
  3. Have a table of consecutive numbers.
  4. Use LEFT JOIN to get even missing units of time.
  5. Do the GROUP BY.
  6. Convert back from units of time to actual time -- for display.

(Steps 3 and 4 are optional. The question says "every", so I assume they are needed.)

Steps 1 and 2 are embodied in something like

FLOOR(UNIX_TIMESTAMP(created_at) / (30*60))

For example:

mysql> SELECT NOW(), FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60));
+---------------------+----------------------------------------+
| NOW()               | FLOOR(UNIX_TIMESTAMP(NOW()) / (30*60)) |
+---------------------+----------------------------------------+
| 2018-03-02 08:24:48 |                                 844448 |
+---------------------+----------------------------------------+

Step 3 is needs to be done once and kept in a permanent table. Or, if you have MariaDB, use a "seq" pseudo-table; for example `seq_844448_to_900000 would dynamically give a table that would reach pretty far into the future.

Step 6 example:

mysql> SELECT DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i");
+-------------------------------------------------------------+
| DATE_FORMAT(FROM_UNIXTIME((844448) * 30*60), "%b %d %h:%i") |
+-------------------------------------------------------------+
| Mar 02 08:00                                                |
+-------------------------------------------------------------+
+---------------------------------------------------------------+
| DATE_FORMAT(FROM_UNIXTIME((844448+1) * 30*60), "%b %d %h:%i") |
+---------------------------------------------------------------+
| Mar 02 08:30                                                  |
+---------------------------------------------------------------+
Rick James
  • 135,179
  • 13
  • 127
  • 222
0

A different Approach without creating additional tables. May look like a hack though :-)

Step 1 : Generate a Time Table Dynamically

Assumption : INFORMATION_SCHEMA DB is avaialble and has a table COLLATIONS which normally has more than 100 records. You can use any table which has minimum 48 records

Query :

SELECT @time fromTime, ADDTIME(@time, '00:29:00')  toTime,  
@time := ADDTIME(@time, '00:30:00') 
FROM information_schema.COLLATIONS
JOIN (SELECT @time := TIME('00:00:00')) a
WHERE @time < '24:00:00'

Above query will give a table with from time and to time with an interval of 30 minutes.

Step 2 : Use the first query to generate required result joining urls table

Query :

SELECT CONCAT(fromTime, '-', toTime) AS halfHours, COUNT(created_at)
FROM 

(SELECT @time fromTime, ADDTIME(@time, '00:29:00')  toTime,  @time := ADDTIME(@time, '00:30:00') 
FROM information_schema.COLLATIONS
JOIN (SELECT @time := TIME('00:00:00')) a
WHERE @time < '24:00:00'
) timeTable

LEFT JOIN urls ON HOUR(created_at) BETWEEN HOUR(fromTime) AND HOUR(toTime) 
            AND MINUTE(created_at) BETWEEN MINUTE(fromTime) AND MINUTE(toTime) 
GROUP BY fromTime           

SQLFiddle

Akhil
  • 2,602
  • 23
  • 36
0

I hope this will work for,

SELECT 
@sTime:= CONCAT(HOUR(created_at),":",
    (CASE WHEN MINUTE(created_at) > 30 THEN 30 ELSE 0 END)) as intVar,
(CONCAT(
    AddTime(@sTime, '00:00:00'),
    ' to ',
    AddTime(@sTime, '00:30:00')
)) as timeInterval, 
COUNT(*) FROM urls 
GROUP BY 
(CONCAT(HOUR(created_at),":",(CASE WHEN MINUTE(created_at) > 30 THEN 30 ELSE 0 END))) 
ORDER BY HOUR(created_at) ASC
Skalbhile
  • 156
  • 13
0

Simply convert to sec and divide by 30 mins(1800secs). And to verify i used min, max on timestamp.

SELECT concat(TIME_FORMAT(min(created_at),"%H:%i")," - ", TIME_FORMAT(max(created_at),"%H:%i")) as hours,
       COUNT(*)
FROM urls
GROUP BY FLOOR(TIME_TO_SEC(created_at)/1800)
ORDER BY HOUR(created_at) ASC
Manish
  • 67
  • 6