19

I have one table with data about attendance into some events. I have in the table the data of the attendance everytime the user sends new attendance, the information is like this:

mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp  | id_member |
+-------------------+-----------+------------+------------+-----------+
|              1326 |    131327 |        459 | 1363208604 |    131327 |
|              1326 |    131327 |        123 | 1363208504 |    131327 |
|              1326 |    131327 |          1 | 1363208459 |    131327 |
|              1326 |     93086 |          0 |       NULL |     93086 |
|              1326 |     93087 |          0 |       NULL |     93087 |
|              1326 |     93088 |          0 |       NULL |     93088 |
|              1326 |     93093 |          0 |       NULL |     93093 |
|              1326 |     99113 |          0 |       NULL |     99113 |
|              1326 |     99135 |          0 |       NULL |     99135 |
|              1326 |     99199 |          0 |       NULL |     99199 |
|              1326 |     99200 |          0 |       NULL |     99200 |
|              1326 |    131324 |          0 |       NULL |    131324 |
|              1326 |     85850 |          0 |       NULL |     85850 |
|              1326 |     93085 |          0 |       NULL |     93085 |
+-------------------+-----------+------------+------------+-----------+
14 rows in set (0.00 sec)

(This is actually a view, for that reason some of the fields are null).

I can groupby id_member so I get only one row for every member (that is, only the last attendance the user sent). However, when I do it, I received the first attendance the user sent, not the last one.

mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782 GROUP BY id_event,id_member;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp  | id_member |
+-------------------+-----------+------------+------------+-----------+
|              1326 |    131327 |          1 | 1363208459 |    131327 |
|              1326 |     93086 |          0 |       NULL |     93086 |
|              1326 |    131324 |          0 |       NULL |    131324 |
|              1326 |     93087 |          0 |       NULL |     93087 |
|              1326 |     93088 |          0 |       NULL |     93088 |
|              1326 |     93093 |          0 |       NULL |     93093 |
|              1326 |     99113 |          0 |       NULL |     99113 |
|              1326 |     99135 |          0 |       NULL |     99135 |
|              1326 |     85850 |          0 |       NULL |     85850 |
|              1326 |     99199 |          0 |       NULL |     99199 |
|              1326 |     93085 |          0 |       NULL |     93085 |
|              1326 |     99200 |          0 |       NULL |     99200 |
+-------------------+-----------+------------+------------+-----------+
12 rows in set (0.00 sec)

I already tried to add ORDER BY clausules, but they are not working at all... any ideas?

Thanks in advance!

Edit: this is the script that creates the table

CREATE OR REPLACE VIEW view_event_attendance 
    AS
        SELECT 
            tbl_event.id_event,
            tbl_member_event.id_member,
            tbl_event.id_branch_channel,
            tbl_member_event_attendance.id_member_event_attendance,
            IF(ISNULL(tbl_member_event_attendance.attendance), 0, tbl_member_event_attendance.attendance) AS attendance,
            tbl_member_event_attendance.timestamp
        FROM 
            tbl_event
            INNER JOIN 
                tbl_member_event ON tbl_member_event.id_event = tbl_event.id_event
                LEFT OUTER JOIN
                    tbl_member_event_attendance ON tbl_member_event_attendance.id_member_event = tbl_member_event.id_member_event
        ORDER BY 
            tbl_member_event_attendance.timestamp DESC;

EDIT 2:

Thanks a lot MichaelBenjamin, but the problem when using subqueries is the size of the view:

mysql> DESCRIBE SELECT id_branch_channel, id_member, attendance, timestamp, id_member 
    -> FROM (select * from view_event_attendance order by timestamp desc) as whatever
    -> WHERE id_event = 782 
    -> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
| id | select_type | table                       | type   | possible_keys   | key             | key_len | ref                                            | rows  | Extra                                        |
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
|  1 | PRIMARY     | <derived2>                  | ALL    | NULL            | NULL            | NULL    | NULL                                           | 16755 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | tbl_member_event            | index  | id_event        | id_event        | 8       | NULL                                           | 16346 | Using index; Using temporary; Using filesort |
|  2 | DERIVED     | tbl_event                   | eq_ref | PRIMARY         | PRIMARY         | 4       | video_staging.tbl_member_event.id_event        |     1 |                                              |
|  2 | DERIVED     | tbl_member_event_attendance | ref    | id_event_member | id_event_member | 4       | video_staging.tbl_member_event.id_member_event |     1 | Using index                                  |
+----+-------------+-----------------------------+--------+-----------------+-----------------+---------+------------------------------------------------+-------+----------------------------------------------+
4 rows in set (0.08 sec)

As you can see there are a lot of rows in my table, so for that reason I don't want to use subqueries...

EDIT 3:

But adding WHERE to the subquery it looks better...

mysql> DESCRIBE SELECT id_branch_channel, id_member, attendance, timestamp, id_member 
    -> FROM (select * from view_event_attendance where id_event = 782 order by timestamp desc) as whatever
    -> WHERE id_event = 782 
    -> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
| id | select_type | table                       | type  | possible_keys   | key             | key_len | ref                                            | rows | Extra                                        |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
|  1 | PRIMARY     | <derived2>                  | ALL   | NULL            | NULL            | NULL    | NULL                                           |   14 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | tbl_event                   | const | PRIMARY         | PRIMARY         | 4       |                                                |    1 | Using temporary; Using filesort              |
|  2 | DERIVED     | tbl_member_event            | ref   | id_event        | id_event        | 4       |                                                |   12 | Using index                                  |
|  2 | DERIVED     | tbl_member_event_attendance | ref   | id_event_member | id_event_member | 4       | video_staging.tbl_member_event.id_member_event |    1 | Using index                                  |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+----------------------------------------------+
4 rows in set (0.01 sec)

If I can not find anything else not using subqueries, I think I'll choose this as the answer...

Edit 4

After seeing the comments in the answer, I've decided to select another as the answer. Here is the DESCRIBE for both queries, and I think it is obvious what is the best solution:

mysql> DESCRIBE SELECT 
    ->   id_branch_channel,
    ->   id_member, 
    ->   attendance, 
    ->   timestamp,
    ->   id_member
    -> FROM view_event_attendance AS t1
    -> WHERE id_event = 782
    -> AND timestamp = (SELECT MAX(timestamp)
    ->                  FROM view_event_attendance AS t2 
    ->                  WHERE t1.id_member = t2.id_member 
    ->                    AND t1.id_event = t2.id_event 
    ->                  GROUP BY id_event, id_member)
    -> OR timestamp IS NULL
    -> GROUP BY id_event, id_member;
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
| id | select_type        | table                       | type   | possible_keys      | key                      | key_len | ref                                            | rows | Extra                                                     |
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
|  1 | PRIMARY            | tbl_event                   | index  | PRIMARY            | id_member_branch_channel | 4       | NULL                                           |  208 | Using index; Using temporary; Using filesort              |
|  1 | PRIMARY            | tbl_member_event            | ref    | id_event           | id_event                 | 4       | video_staging.tbl_event.id_event               |   64 | Using index                                               |
|  1 | PRIMARY            | tbl_member_event_attendance | ref    | id_event_member    | id_event_member          | 4       | video_staging.tbl_member_event.id_member_event |    1 | Using where; Using index                                  |
|  2 | DEPENDENT SUBQUERY | tbl_event                   | eq_ref | PRIMARY            | PRIMARY                  | 4       | func                                           |    1 | Using where; Using index; Using temporary; Using filesort |
|  2 | DEPENDENT SUBQUERY | tbl_member_event            | eq_ref | id_event,id_member | id_event                 | 8       | video_staging.tbl_event.id_event,func          |    1 | Using where; Using index                                  |
|  2 | DEPENDENT SUBQUERY | tbl_member_event_attendance | ref    | id_event_member    | id_event_member          | 4       | video_staging.tbl_member_event.id_member_event |    1 | Using where; Using index                                  |
+----+--------------------+-----------------------------+--------+--------------------+--------------------------+---------+------------------------------------------------+------+-----------------------------------------------------------+
6 rows in set (0.00 sec)


mysql> DESCRIBE SELECT *
    -> FROM (SELECT id_branch_channel, id_member, attendance, timestamp, id_event 
    ->       FROM view_event_attendance 
    ->       WHERE id_event = 782 
    ->       ORDER BY timestamp desc
    ->      ) as whatever
    -> GROUP BY id_event,id_member;
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
| id | select_type | table                       | type  | possible_keys   | key             | key_len | ref                                            | rows | Extra                           |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>                  | ALL   | NULL            | NULL            | NULL    | NULL                                           |   14 | Using temporary; Using filesort |
|  2 | DERIVED     | tbl_event                   | const | PRIMARY         | PRIMARY         | 4       |                                                |    1 | Using temporary; Using filesort |
|  2 | DERIVED     | tbl_member_event            | ref   | id_event        | id_event        | 4       |                                                |   12 | Using index                     |
|  2 | DERIVED     | tbl_member_event_attendance | ref   | id_event_member | id_event_member | 4       | video_staging.tbl_member_event.id_member_event |    1 | Using index                     |
+----+-------------+-----------------------------+-------+-----------------+-----------------+---------+------------------------------------------------+------+---------------------------------+
4 rows in set (0.00 sec)
Cito
  • 1,659
  • 3
  • 22
  • 49
  • I believe you can do what you're trying for by using max() or a similar function on one of the grouped columns. i don't have a similar table set up right now to test with. – Eric Mar 14 '13 at 22:47
  • You could check my answer for a solution that works without subqueries or joins. – JodyT Mar 14 '13 at 23:25
  • I'd be interested to see actual times; the describe difference is what I'd expect and doesn't lead me to think one is clearly better. And it would be nice if you gave my answer a try too :) – ysth Mar 15 '13 at 18:10
  • @ysth I wish to, but I do not know how to make it... Can you please clarify me? Maybe you can do it here: http://sqlfiddle.com/#!2/5a6e7/1 – Cito Mar 15 '13 at 21:19
  • ok: http://sqlfiddle.com/#!2/af2a1/1 – ysth Mar 15 '13 at 21:41
  • possible duplicate of [Retrieving the last record in each group](http://stackoverflow.com/questions/1313120/retrieving-the-last-record-in-each-group) – Alastair Irvine Apr 13 '15 at 14:33

6 Answers6

12

Use a simple group by id_member, but select:

substring(max(concat(from_unixtime(timestamp),attendance)) from 20) as attendance

This attaches attendance to the timestamp for each row in a group, in order to be able to select the desired timestamp/attendance with max() and then extract just the attendance.

What concat() returns is 19 characters of formatted timestamp (YYYY-mm-dd HH:MM:SS) with the attendance appended starting at character 20; the substring(... from 20) gets just the attendance from the (stringwise) maximum one for the group. You can remove the group by and just

select concat(from_unixtime(timestamp),attendance), timestamp, attendance

to get a better idea of how it uses max to get the right attendance.

ysth
  • 96,171
  • 6
  • 121
  • 214
  • 1
    Can You Update your answer with the full query? I'll add the explain later. Thanks for your help. Also, could you explain why the `from 20`? – Cito Mar 16 '13 at 00:58
  • 2
    I'd rather just provide the expression to select, since no other changes are necessary to your original query; having to wade through the full query will make it harder for someone else to find and use this idea with their query. – ysth Mar 16 '13 at 01:07
  • very clever, and dramatically faster than the subquery-and-join approach – Brad Mace Jan 08 '14 at 22:42
  • I love this solution, it's the faster ever for this problem. You have to pay attention and consider length of your discriminating fields but with right and left it's simple. Great! – AndreaTaroni86 Mar 19 '21 at 14:52
4
SUBSTRING_INDEX(SUBSTRING_INDEX(group_concat(%requiredfield%), ',', count(*)),',',-1)

This will get the last value of the 'required field' from any group_concat, if unsorted it will be the last value in the table by default.

Could use group_concat_ws to account for possible null fields.

forpas
  • 160,666
  • 10
  • 38
  • 76
Sarang
  • 51
  • 5
3
SELECT id_branch_channel, id_member, attendance, timestamp, id_member 
FROM (select * from view_event_attendance order by timestamp desc) as whatever
WHERE id_event = 782 
GROUP BY id_event,id_member;

EDIT: This may yield better performance:

SELECT *
FROM (SELECT id_branch_channel, id_member, attendance, timestamp, id_member 
      FROM view_event_attendance 
      WHERE id_event = 782 
      ORDER BY timestamp desc
     ) as whatever
GROUP BY id_event,id_member;

As long as the result-set can fit into the Innodb_buffer_pool, you will not see a significant performance drop.

Michael Benjamin
  • 2,895
  • 1
  • 16
  • 18
3

The following works, but unfortunately it has a subquery again and won't perform much better than the other solutions posted here.

SELECT 
  id_branch_channel,
  id_member, 
  attendance, 
  timestamp,
  id_member
FROM view_event_attendance AS t1
WHERE id_event = 782
AND timestamp = (SELECT MAX(timestamp)
                 FROM view_event_attendance AS t2 
                 WHERE t1.id_member = t2.id_member 
                   AND t1.id_event = t2.id_event 
                 GROUP BY id_event, id_member)
OR timestamp IS NULL
GROUP BY id_event, id_member;
miken32
  • 42,008
  • 16
  • 111
  • 154
JodyT
  • 4,324
  • 2
  • 19
  • 31
  • I'll test It tomorrow and let You know if It fits my needs. If does then this will be the answer. Thanks for your help. – Cito Mar 15 '13 at 03:58
  • This actually works and I prefer this solution. I think it cost less to the database, and I really need to improve in this regard. Thanks a lot to everyone for your help. I already knew how to do it with subqueries, but I don't like it so much, so for this reason I'm chosing this as the solution to my problem. – Cito Mar 15 '13 at 04:06
  • can you explain how it works? (because I think you are completely misunderstanding what `having max(timestamp)` does) – ysth Mar 15 '13 at 04:24
1

Here is one option (untested):

SELECT v.id_branch_channel, v.id_member, v.attendance, v.timestamp, v.id_member 
FROM view_event_attendance v
    JOIN (
        SELECT id_event, id_member, MAX(attendance) maxattendance
        FROM view_event_attendance 
        GROUP BY id_event, id_member ) m ON 
            v.id_event = m.id_event AND
            v.id_member = m.id_member AND
            v.attendance = m.maxattendance
WHERE v.id_event = 782 
GROUP BY v.id_member;

The concept is to get the MAX() of timestamp and use that field to JOIN on your view. You might not need all the fields -- really depends on your table structure. But this should get you going in the correct direction.

sgeddes
  • 62,311
  • 6
  • 61
  • 83
  • It works great! But the only problem is that only returns one row... I want all the 12 rows in the example I gave but the first one, instead of being the row with attendance 1, the row with attendance 459. Thanks for your help. I'll check your code and see if I can modfiy it to my benefit (although I wish not to do subqueries...) – Cito Mar 14 '13 at 22:58
  • @AbrahamSustaita -- The only other option I can think of without using subqueries is to use user-defined variables in your query. However, I don't think you'd get much of a performance increase with that -- in fact, it might be slower. I didn't completely understand your question -- replace MAX(timestamp) with MAX(attendance) and you should get the 459 instead (obviously update your JOIN as well)... – sgeddes Mar 14 '13 at 23:11
  • 1
    @sgeddes: I believe he wants the attendance from the row with the greatest timestamp, even if the timestamp is lower. – ysth Mar 15 '13 at 04:38
-1

One way to do this is to use a window function and a subquery, if you add an entry to your select list as row_number() over (partition by id_member order by timestamp desc) this will resolve to a number ordering the rows by timestamp (with 1 being the oldest) grouped in each id_member group (run it if this doesn't make sense, it will be clear). You can then select from this as a subquery where the extra column = 1 which will only select the rows with the highest timestamp within each group.

Steve
  • 1,215
  • 6
  • 11
  • Thanks for your help... but I really have no idea what do you mean... sorry. Can you please give me an example with the query I added? Thanks a lot for your willing to help me! :) – Cito Mar 14 '13 at 22:54
  • window functions require mysql 8.0 or mariadb 10.2 – ysth Jun 16 '19 at 22:04