0

I am trying to aggregate some call records and doing something wrong - but not sure what.

I have the following query:

SELECT 
    c.name, 
    IF(MONTH(start) = 1, SUM(duration),0) AS 'Jan',
    IF(MONTH(start) = 2, SUM(duration),0) AS 'Feb',
    IF(MONTH(start) = 3, SUM(duration),0) AS 'Mar',
    IF(MONTH(start) = 4, SUM(duration),0) AS 'Apr',
    IF(MONTH(start) = 5, SUM(duration),0) AS 'Jun',
    IF(MONTH(start) = 6, SUM(duration),0) AS 'Jul',
    IF(MONTH(start) = 7, SUM(duration),0) AS 'Aug'
FROM
    call_history ch, client c
WHERE
    ch.client_reseller_id = c.id
GROUP BY ch.client_reseller_id
ORDER BY c.name;

The field start contains the date/time of a call. The table only contains records for one year, so no worries about not filtering on year.

The results I get are not as expected:

+--------------------------------+----------+------+------+--------+------+------+------+
| name                           | Jan      | Feb  | Mar  | Apr    | Jun  | Jul  | Aug  |      
|+-------------------------------+----------+------+------+--------+------+------+------+
| Come company                   |  5243080 |    0 |    0 |      0 |    0 |    0 |    0 |
| Other cust                     |  4085085 |    0 |    0 |      0 |    0 |    0 |    0 |
| Someone                        |  1449543 |    0 |    0 |      0 |    0 |    0 |    0 |
| Demo Reseller                  |     2342 |    0 |    0 |      0 |    0 |    0 |    0 |
+--------------------------------+----------+------+------+--------+------+------+------+

What am I doing wrong?

btongeorge
  • 421
  • 2
  • 12
  • 23
  • what are your expected results? – Siva Aug 15 '14 at 11:03
  • That's because your `group by` usage is invalid (standard) SQL. Every other DBMS would reject your query but MySQL chooses to return "random" (they call it indeterminate) results instead and not bother you with an error message. For details see here: http://www.mysqlperformanceblog.com/2006/09/06/wrong-group-by-makes-your-queries-fragile/ and here: http://rpbouman.blogspot.de/2007/05/debunking-group-by-myths.html –  Aug 15 '14 at 11:32

2 Answers2

3

Change like this

SELECT 
    c.name, 
   SUM(IF(MONTH(start) = 1, duration,0)) AS 'Jan',
    SUM(IF(MONTH(start) = 2, duration,0)) AS 'Feb',
    SUM(IF(MONTH(start) = 3, duration,0)) AS 'Mar',
    SUM(IF(MONTH(start) = 4, duration,0)) AS 'Apr',
    SUM(IF(MONTH(start) = 5, duration,0)) AS 'Jun',
    SUM(IF(MONTH(start) = 6, duration,0)) AS 'Jul',
    SUM(IF(MONTH(start) = 7, duration,0)) AS 'Aug'
FROM
    call_history ch, client c
WHERE
    ch.client_reseller_id = c.id
GROUP BY ch.client_reseller_id
ORDER BY c.name;
Sathish
  • 4,419
  • 4
  • 30
  • 59
1

You want:

  SELECT c.name, 
         SUM((MONTH(ch.start)=1)*ch.duration) 'Jan',
         SUM((MONTH(ch.start)=2)*ch.duration) 'Feb',
         SUM((MONTH(ch.start)=3)*ch.duration) 'Mar',
         SUM((MONTH(ch.start)=4)*ch.duration) 'Apr',
         SUM((MONTH(ch.start)=5)*ch.duration) 'Jun',
         SUM((MONTH(ch.start)=6)*ch.duration) 'Jul',
         SUM((MONTH(ch.start)=7)*ch.duration) 'Aug'
    FROM call_history ch
    JOIN client c
      ON c.id = ch.client_reseller_id
GROUP BY c.name
ORDER BY c.name;

This uses the fact that booleans are evaluated as true => 1 and false => 0 in MySQL.

It avoids using an IF which is one more logical branch.. but I'm not sure which will be faster because of the * operation, it may be fun to benchmark. IF is also restricted to MySQL, you could use the standard CASE.

I could concede that some could find IF/CASE more readable than using the BOOLEAN => 1,0 dynamic which again may be restricted to MySQL.

I have also included the table aliases on the columns, changed your GROUP BY to the more obvious c.name and used an explicit JOIN.

These are readability issues..

Including the table aliases means you can read the query in isolation and know where each column is coming from.. for example when posting questions! I had to assume that start and duration were from the call_history, which doesn't matter here but absolutely can in other questions.

I tend to GROUP BY the non-aggregate columns in my SELECT in MySQL just to make sure I haven't missed anything. Correct me if I'm wrong, but I think other engines require it. If you have duplicate c.names I would select both c.id and c.name and GROUP BY both to remove any confusion.

As for use of implicit JOINs you can see a discussion here

Community
  • 1
  • 1
Arth
  • 12,789
  • 5
  • 37
  • 69