2

I've read similar questions here on stackoverflow, but the OP's table structure is never quite the same as mine, so the answer doesn't work for me. The posts I've read are only trying to GROUP BY one column as opposed to two. I'm using MySQL, latest stable release.

Here's my table "reference":

id    formatID    referenceTime
1     1           2011-6-12 12:40
2     2           2011-6-12 1:04
3     4           2011-6-12 1:03
4     2           2011-6-12 15:20
5     3           2011-6-12 9:30
6     3           2011-6-12 2:55
7     5           2011-6-12 13:15
8     1           2011-6-12 12:32
(etc)


I want to create a query that show how many of each type of format occurred by hour of day. The point of this is to see what is the busiest time of day. I am trying to write a query that will create output that I can use for some simple graph web apps (Highcharts.js). I want it to look like this:

Timeofday  Subgroup  Count
12AM       1         2
12AM       2         6
12AM       3         7
12AM       4         2
12AM       5         0
1AM        1         3
1AM        2         3
1AM        3         0
1AM        4         0
1AM        5         1
(etc)


I'm using this query:

SELECT date_format(referenceTime,'%I %p') AS timeofday,
  reference.referenceFormatID AS subgroup,
  count(*) AS count
FROM reference
GROUP BY timeofday,subgroup ASC


However, the output skips "rows" where the count equals zero and so ends up looking like this:

Timeofday  Subgroup  Count
12AM       1         2
12AM       2         6
1AM        3         7
1AM        4         2
1AM        5         1
3AM        1         3
6AM        2         3
7AM        3         1
7AM        4         1
9AM        5         1
(etc)

I need those zeros to be able to create a properly formatted data series for my app.

The LEFT JOIN method where you put all the times into a second table isn't working for me because I am grouping by two different columns. Apparently, the LEFT JOIN criteria is satisfied as long as each hour shows up somewhere in the output table, but I need each hour to appear for each format.

Any suggestions?

Joshua Welker
  • 547
  • 1
  • 9
  • 20

1 Answers1

2

You have two options, either create a lookup table with the possible hours in it, or use strange query involving the dual table and union to get the values that you are looking for.

In the first case, you would have a table with maybe a single field for the moment, let's just call it hours and the field is timeofday.

In the hours timeofday, you would have the following data:

timeofday
12AM
1AM
2AM
....

Then your query is as simple as

SELECT hours.timeofday, 
  reference.referenceFormatID AS subgroup,
  count(reference.referenceFormatID) AS count
FROM hours
 LEFT JOIN reference on date_format(referenceTime,'%I %p') = hours.timeofday
GROUP BY hours.timeofday,subgroup ASC

EDIT

To get all combinations, you would also need a formats table with all the possible formatIDs as was mentioned by rfausak. You could also do this with a distinct, but let's just assume that you have this table, let's call it formats. Again, this table could have a single column.

Part 1 is to get all the combinations:

SELECT hours.timeofday, 
       formats.ID
from hours
join formats

This is a Cartesian join that would merge all possible hours and format IDs.

Now we add in the LEFT JOIN

SELECT hours.timeofday, 
       formats.ID,
       count(reference.subgroup)
FROM hours
JOIN formats
LEFT JOIN reference on date_format(referenceTime,'%I %p') = hours.timeofday
     AND reference.subgroup  = formats.ID
GROUP BY hours.timeofday,formats.ID ASC

If you try to do it using a DUAL table look up, you can use a method similar to generate days from date range

Community
  • 1
  • 1
Brian Hoover
  • 7,861
  • 2
  • 28
  • 41
  • 1
    Third option would be to use a programming language. Also I think he needs a `formats` table in order to list all the results that he wants. – Rusty Fausak Dec 21 '11 at 21:25
  • 1
    The query you gave me does indeed produce zero rows for hours with no records, but it does not produce zero rows for every subgroup within every hour. Does that make sense? – Joshua Welker Dec 21 '11 at 22:53
  • It worked!!! You saved the day, friend. I would never have figured that out. I need to look up what a Cartesian join is. If you're still around, can you explain to me how count(reference.subgroup) differs functionally from the count(*) I originally had? – Joshua Welker Dec 22 '11 at 13:53
  • count(reference.subgroup) only counts cases where reference.subgroup is not null. count(*) counts all the rows. – Brian Hoover Dec 22 '11 at 14:19