-1

Please assume this table:

// mytable
+--------+-------------+---------+
|  num   | business_id | user_id |
+--------+-------------+---------+
| 3      | 503         | 12      |
| 7      | 33          | 12      |
| 1      | 771         | 13      |
| 2      | 86          | 13      |
| 1      | 772         | 13      |
| 4      | 652         | 14      |
| 4      | 567         | 14      |
+--------+-------------+---------+

I need to group it based on user_id, So, here is my query:

select max(num), user_id from mytable
group by user_id

Here is the result:

// res
+--------+---------+
|  num   | user_id |
+--------+---------+
| 7      | 12      |
| 2      | 13      |
| 4      | 14      |
+--------+---------+

Now I need to also get the business_id of those rows. Here is the expected result:

// mytable
+--------+-------------+---------+
|  num   | business_id | user_id |
+--------+-------------+---------+
| 7      | 33          | 12      |
| 2      | 86          | 13      |
| 4      | 567         | 14      |  -- This is selected randomly, because of the equality of values
+--------+-------------+---------+

Any idea how can I do that?

stack
  • 10,280
  • 19
  • 65
  • 117

1 Answers1

0

You don't group. You filter. One method uses window functions such as row_number():

select t.*
from (select t.*,
             row_number() over (partition by user_id order by num desc) as seqnum
      from mytable t
     ) t
where seqnum = 1;

Another method which can have slightly better performance with an index on (user_id, num) is a correlated subquery:

select t.*
from mytable t
where t.num = (select max(t2.num)
               from mytable t2
               where t2.user_id = t.user_id
              );

You should think "group by" when you want to summarize rows. You should think "where" when you want to choose rows with particular characteristics.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Great .. thanks buddy – stack Apr 17 '21 at 12:26
  • I've tested your query .. it doesn't handle duplicates .. https://i.stack.imgur.com/OApuW.png – stack Apr 17 '21 at 12:38
  • @stack . . . What do you want when there are duplicates? Your question doesn't specify. The use of `group by`, in fact, implies that you want just one row per `user_id`, which is what the first query does. – Gordon Linoff Apr 17 '21 at 13:17