1

I have a view with a lot of data. I can got this results using subqueries (data that is ok and optimized):

+------------+
| attendance |
+------------+
|        319 |
|        102 |
|        598 |
|        113 |
|          6 |
|        279 |
|        366 |
|        146 |
|        669 |
|        205 |
|        123 |
+------------+

The next time some user update data, it shows this:

+------------+
| attendance |
+------------+
|        319 |
|        102 |
|        598 |
|        113 |
|          7 |
|        279 |
|        253 |
|        146 |
|        669 |
|        561 |
|        123 |
+------------+

Which is ok, 'cause the user that update the information was the one that before has 6 as attendance.

But the problem comes when I use that data as a temptable and then I make:

 SELECT SUM(attendance) AS total FROM ( /* Subquery returning the above table */)

cause it returns (in the first place with one user having 6 as attendance):

+-------+
| total |
+-------+
|  3169 |
+-------+

And with 7:

+-------+
| total |
+-------+
|  3128 |
+-------+

When it should be 3170!!!

Ideas?

EDIT 1: Pasting the full query.

SELECT SUM(att_member) AS total
FROM
    (SELECT attendance AS att_member
     FROM
         (SELECT id_branch_channel, id_member, attendance, TIMESTAMP, id_event
          FROM view_event_attendance
          WHERE id_event = 782
          ORDER BY TIMESTAMP DESC) AS temptable
     GROUP BY 
              id_member) AS total_attendance_temp

EDIT 2: Pasting the query help I got from here

Select only last value using group by at mysql

Here is the schema of the view.

Community
  • 1
  • 1
Cito
  • 1,659
  • 3
  • 22
  • 49
  • 1
    Either it's a MySQL bug or there's a problem with the subquery. I'm betting on the subquery :) Can you post it? – Ed Gibbs Apr 19 '13 at 21:11
  • unless the queries are extraordinarily huge/ugly, can you show them here? – Marc B Apr 19 '13 at 21:12
  • 2
    Also note that the lists of values in your post have differences beyond the 6 and 7 on row 5. Row 7 has also 366 in the first list and 253 in the second, and row 10 has 205 in the first list and 561 in the second. – Ed Gibbs Apr 19 '13 at 21:19
  • @EdGibbs Yeap, you are right. I haven't noticed that. I'll search on the subquery then... – Cito Apr 19 '13 at 21:21
  • Do you need the subquery for something else? Because you can sum the attendance just by using `SELECT SUM(attendance) AS total FROM view_event_attendance WHERE id_event = 782` – Ed Gibbs Apr 19 '13 at 21:21
  • @EdGibbs Look at this: http://stackoverflow.com/questions/15421616/select-only-last-value-using-group-by-at-mysql – Cito Apr 19 '13 at 21:22
  • 1
    @AbrahamSustaita - thanks for that link; now I understand why you need the subquery :) BTW, excellent work on that question! – Ed Gibbs Apr 19 '13 at 21:25

2 Answers2

1

Lets disect the query, shall we?

I assume view_event_attendance has one record for every attendee (member) who attended at an event. id_event is a FK to that event, id_member is FK to the attendee. Your inner select gives you an ordered list of all members who attended event #782

SELECT id_branch_channel, id_member, attendance, TIMESTAMP, id_event
FROM view_event_attendance
WHERE id_event = 782
ORDER BY TIMESTAMP DESC

So far, so hoopy. Now you wrap this query in another one:

SELECT attendance AS att_member
FROM (subquery)
GROUP BY id_member

In most SQL dialetcs, this is simply a syntax error. MySQL allows this, but the result is probably not what you were looking for. You will get attendance column for every id_member who attended in said event. What you might actually expect is the SUM of attendances, but I you don't say so in your question. In any case, every selected field should either be in your GROUP BY clause or using an aggregate function, such as

SELECT SUM(attendance) AS att_member
FROM (subquery)
GROUP BY id_member

or

SELECT attendance AS att_member
FROM (subquery)
GROUP BY id_member, attendance

Having that said, I don't see a need for this to use a subquery to begin with. Let's assume you wanted to get the SUM as above, you could reprase this into a single SQL query:

SELECT SUM(attendance) AS att_member
FROM view_event_attendance
WHERE id_event = 782
GROUP BY id_member

If you then wanted the total, you could simply leave out the GROUP BY clause, leaving you with this:

SELECT SUM(attendance) AS att_member
FROM view_event_attendance
WHERE id_event = 782

If this doesn't work as expected, please describe in more detail what you're actually storing in view_event_attendance, and what you want the second query to calculate.

Hazzit
  • 6,782
  • 1
  • 27
  • 46
  • Thanks for your help! But I think if you [read the other link I pasted](http://stackoverflow.com/questions/15421616/select-only-last-value-using-group-by-at-mysql) then you will understand why I need the subquery. I've found, however my error. Reading again that link I remember I never use the answer there, so doing it like that fixed it. Any comments on there? Look the answer I'll give here. And thanks again :) – Cito Apr 19 '13 at 21:36
0

As @EdGibbs noticed, the subquery was returning bad results (something I had not noticed). So I read again my code and the answer to the other question I make reference and the things now work as they should, by doing this:

SELECT SUM(att_member) AS total
FROM
    (SELECT attendance AS att_member
     FROM
         (SELECT 
            id_branch_channel, 
            id_member, 
            substring(max(concat(from_unixtime(timestamp),attendance)) from 20) as attendance, 
            timestamp, 
            id_event
          FROM view_event_attendance
          WHERE id_event = 782
          GROUP BY id_event, id_member
          ORDER BY timestamp DESC) AS temptable
     ) AS total_attendance_temp

The only change I did was:

  • use: substring(max(concat(from_unixtime(timestamp),attendance)) from 20) as attendance instead of only attendace.
  • moving the group inside the first subquery.

So the problem was not in the SUM function at MySQL but in my query.

Thanks to all for taking the time to help me with this. Your suggestions help me to get the answer.

Community
  • 1
  • 1
Cito
  • 1,659
  • 3
  • 22
  • 49
  • I now understand that you *are in fact relying* on non-standard behaviour. I can see why you're doing it, but what may look like a smart solution now may turn out to be the *worst thing ever*(tm) in hindsight. – Hazzit Apr 19 '13 at 21:48
  • So then what is your suggestion? Let me explain what I'm doing. I need to have the log of the attendance to the events. So for that reason I have a table with only that information: id_member_event(this is a fk to a table with the information of the event-member relation), attendance, timestamp. So in that table I can have more than once attendance for an event (let say, that an user send more than on in a large event) – Cito Apr 19 '13 at 21:53
  • In my view_event_attendance I have the information of the whole event: id_event,id_member,_id_member_event,id_branch_channel,id_member_event_attendance,is_online,attendance,timestamp – Cito Apr 19 '13 at 21:55
  • I guess it all depends on what your inner view actually does, and if there is a way to use an aggregate function (e.g. MAX()) to retrieve the latest attendance count. If there is no way to optimize it further, you may want to consider denormalizing your log, so that the parent record gets a latest_attendance column. Looking at two subqueries and dependancy on non-standard behaviour already, and imagining your view also is non-trivial, I suspect your solution is a) hard to understand, b) hard to document fully and c) a potential performance hog when the underlying tables get large. – Hazzit Apr 19 '13 at 22:16
  • @Hazzit Can you give me an example of how that query can be recoded to use MAX()? – Cito Apr 22 '13 at 21:36