4

I have a table with several hundred thousand entries and I'm trying to use a query to get a result set for a specific receiver_id and group them by sender_id. My current SQL query works but I want to know if there could be any potential problems with using two MAX calls in the statement. It looks like this:

SELECT MAX(id) as id, sender_id, receiver_id, MAX(date) as date
FROM     messages
WHERE    receiver_id=5 and belong_to=5
GROUP BY sender_id

The table date looks like this:

id sender_id receiver_id content date                 belong_to           
-- --------- ----------- ------- -------------------  ---------
1  5         7           test    2013-03-11 10:33:54  7
2  5         7           test    2013-03-11 10:33:54  5
3  13        7           test 2  2013-03-13 12:01:36  7
4  13        7           test 2  2013-03-13 12:01:36  13
5  5         7           test 3  2013-03-14 09:15:37  7
6  5         7           test 3  2013-03-14 09:15:37  5
7  25        5           data1   2013-03-15 11:01:36  5
8  25        5           data1   2013-03-15 11:01:36  25
9  16        5           data2   2013-03-17 09:17:17  5
10 16        5           data2   2013-03-17 09:17:17  16
11 25        5           data3   2013-04-05 09:17:17  5
12 25        5           data3   2013-04-05 09:17:17  16

The output from my query is this:

id sender_id receiver_id date               
-- --------- ----------- -------------------
9  16        5           2013-03-17 09:17:17
11 25        5           2013-04-05 09:17:17

Are there any issues with this query using the MAX calls? If so what is the alternative?

Kaii
  • 20,122
  • 3
  • 38
  • 60
Paul
  • 11,671
  • 32
  • 91
  • 143
  • 1
    No this is fine. However, realize that the only reason you can include `Receiver_id` in the select clause without also grouping by `receiver_id` is because the where clause restricts the query to only one value for `receiver_id` [5]. If there were more than one receiver_id value in the results you would have to either also group by `receiver_id`, or not include it in the results – Charles Bretana May 11 '13 at 13:22
  • 2
    Note that `receiver_id` will be chosen 'at random' from one of the rows that fit under that group by, because it is not an aggregate and it is not grouped over in an aggregate/group by query. (Only MySQL will let you do this, all other SQL flavours will prevent such meaningless queries.) – Patashu May 11 '13 at 13:23
  • No, using multiple `MAX` aggregations is definitely intended to work. You may want to change your `GROUP BY` to `GROUP BY sender_id,receiver_id` though for portability's sake. – Joachim Isaksson May 11 '13 at 13:23
  • This is fine but this is not standard SQL. You'd have to group by all the fields of the query otherwise. – Sebas May 11 '13 at 13:24
  • 2
    Note that it's not guaranteed that both `MAX()` values refer to the same row. – JJJ May 11 '13 at 13:26
  • 2
    @Pataswhu, no, he is also restricting output to only `receiver_id = 5`. This is perfectly ok. – Charles Bretana May 11 '13 at 13:27
  • @Juhana This is my concern...I'm assuming it'll always be the same row... – Paul May 11 '13 at 13:29
  • Is there a better way to retrieve the same result set? – Paul May 11 '13 at 13:31
  • What are you actually trying to do? – Strawberry May 11 '13 at 13:31
  • @Strawberry I'm trying to get a unique list of sender_ids for a particular receiver_id with the most recent entry (date) for each. – Paul May 11 '13 at 13:32
  • I thought so... I'd show you how to do that but I suspect someone will beat me to it... – Strawberry May 11 '13 at 13:33
  • 2
    @Paul if you want it to be the same row for both, then the meaning of Max(Id) and Max(date) becomes un clear, in one row there Is only one value for each of these columns. What rule do you wish to use to identify this one row? the id ? or the Date? you cannot use both as they may point to different rows. – Charles Bretana May 11 '13 at 13:33
  • @Strawberry I'm all ears :) – Paul May 11 '13 at 13:33
  • @Charles Bretana It's not okay imo, it's a bad habit especially if you don't know what the problem is. You could group by receiver_id for example. – Patashu May 11 '13 at 13:46
  • 1
    @Patashu, I could agree it is a bad habit. But it's perfectly legal, correct and functional. It is no different than `Select ColumnA, Max(someOtherClumn) From table where COlumnA = [SomeValue]` which we do all the time. And if the query restricts output to only one `reciever_id`, it would be kinda foolish to group by that column, just to conform to your esthetic preferences, no? – Charles Bretana May 11 '13 at 14:02

2 Answers2

3

I don't quite understand your structure (so, for instance, this example assumes that a UNIQUE key could be imposed upon sender_id, receiver_id, date, belong_to), but I suspect you want something like this. Filter by user as necessary..

SELECT x.* 
  FROM messages x
  JOIN 
     ( SELECT sender_id
            , receiver_id
            , MAX(date) max_date 
         FROM messages 
        GROUP 
           BY receiver_id
            , sender_id
     ) y 
    ON y.sender_id = x.sender_id 
   AND y.receiver_id = x.receiver_id 
   AND y.max_date = x.date
 WHERE x.belong_to = x.receiver_id;
Strawberry
  • 33,750
  • 13
  • 40
  • 57
  • This seems to be working great...all I had to do was append: 'AND x.receiver_id = 5' – Paul May 11 '13 at 14:03
0

Based on comments what you want is:

' unique [list of sender_ids for a particular receiver_id with the most recent entry (date) for each[receiver]'

If you mean ALL Senders who have the latest entry date for that single receiver, then:

Select * From Messages m
Where date = (Select Max(date) From messages
              Where receiver_id = m.receiver_id) 
    And receiver_id = 5 -- add this if you only want results for one receiver_id

If otoh, you mean ' unique list of sender_ids for a particular receiver_id with the most recent entry (date) for each[receiver-sender combination]', then do this

Select * From Messages m
Where date = (Select Max(date) From messages
              Where Sender_id = m.Sender_id 
                  And receiver_id = m.receiver_id) 
    And receiver_id = 5 -- add this if you only want results for one receiver_id
Charles Bretana
  • 143,358
  • 22
  • 150
  • 216
  • This doesn't look right. Which part guarantees unique sender_ids? – JJJ May 11 '13 at 13:38
  • As I read Paul's comment, He doesn't want unique sender Ids, he wants all `senderId`s that have the latest send date for a single receiver. But on chance you are right, I edited answer to include both possibilities – Charles Bretana May 11 '13 at 13:43