44

I have the following table called questions:

ID | asker 
1  | Bob
2  | Bob
3  | Marley

I want to select each asker only once and if there are multiple askers with the same name, select the one of the highest id. So, the expected results:

ID | asker 
3  | Marley
2  | Bob

I use the following query:

SELECT * FROM questions GROUP by questions.asker ORDER by questions.id DESC

I get the following result:

ID | asker 
3  | Marley
1  | Bob

It selects the first 'Bob' it encounters instead of the last one.

Dharman
  • 30,962
  • 25
  • 85
  • 135
Michael Samuel
  • 3,820
  • 12
  • 45
  • 85

7 Answers7

68

If you want the last id for each asker, then you should use an aggregate function:

SELECT max(id) as id, 
   asker
FROM questions 
GROUP by asker 
ORDER by id DESC

The reason why you were getting the unusual result is because MySQL uses an extension to GROUP BY which allows items in a select list to be nonaggregated and not included in the GROUP BY clause. This however can lead to unexpected results because MySQL can choose the values that are returned. (See MySQL Extensions to GROUP BY)

From the MySQL Docs:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.

Now if you had other columns that you need to return from the table, but don't want to add them to the GROUP BY due to the inconsistent results that you could get, then you could use a subquery to do so. (Demo)

select 
  q.Id,
  q.asker,
  q.other -- add other columns here
from questions q
inner join
(
  -- get your values from the group by
  SELECT max(id) as id, 
    asker
  FROM questions 
  GROUP by asker 
) m
  on q.id = m.id
order by q.id desc
Taryn
  • 242,637
  • 56
  • 362
  • 405
  • @itsazzad What do you mean "not working for me"? That's not the most descriptive explanation of a problem. – Taryn Apr 19 '16 at 11:39
  • 5
    if I use `SELECT *, max(id) as id` then its not giving correct result – itsazzad Apr 19 '16 at 11:44
  • @itsazzad Are you using `GROUP BY` with my version? – Taryn Apr 19 '16 at 11:47
  • @itsazzad Well you're not providing a lot of details on what might be the problem, as long as you figured it out using another answer great. – Taryn Apr 19 '16 at 11:52
  • this selects only the max id in the group and add it's value to another row -the first row- of the group. – Accountant م Sep 09 '16 at 13:28
  • @theaccountant I'm not sure what you are asking. This gives the max id value for each group which is what the OP asked for. – Taryn Sep 09 '16 at 13:33
  • I mean it's not selecting the whole -every column- max row in the group. it still selects the first row in the group and add another column `max(id) as id` to the selected row. it worked here because we only have columns id and asker. if we have more column they will be belonging to the first row in the group – Accountant م Sep 09 '16 at 13:54
  • 1
    @theaccountant if you need more columns then you'll add them to the group by or you will need a subquery – Taryn Sep 09 '16 at 14:29
  • what if i need all data – TarangP Jun 24 '18 at 08:34
  • This answer does not help when we need to select also another columns. We will get mixed values in query result from different table rows. See my answer here to solve it https://stackoverflow.com/a/55452997/3826175 – mikep Apr 01 '19 at 11:28
  • @mikep Actually it does help. MySQL allows you to include other columns in the select list that are not in the GROUP BY - this is what leads to inconsistent results. If as you point out in your answer you want other columns, then you either need to group by them or you have to use a subquery to return those columns as well - here is an example of that - http://sqlfiddle.com/#!9/a82d769/5/0 – Taryn Apr 01 '19 at 12:09
  • @Taryn Your solution with inner join and subquery is too long (although you used shortcuts for tables) and too complicated when compared to my solution in my answer. But the worst thing is performance. I have similar use case such is in this question. My table has 4.500.000 rows. I tried to re-write my easy solution with `GROUP BY ... DESC` or `GROUP BY col1 ORDER BY col1 DESC` (to be compatible with MySQL 8.0+) to your solution and it is 5 times slower. When you downvoted my answer you should leave comment why. – mikep Apr 01 '19 at 13:22
  • @mikep I disagree that it's too long, it returns the result desired and follows SQL standards. As far as performance, indexes can help most queries depending on the system. Finally, as far as "when I downvoted your answer" - I didn't, votes are anonymous and by you posting a late answer it bubbles the question to recently active, so you can't assume the vote came from me. – Taryn Apr 01 '19 at 13:33
  • @Taryn Of course I was comparing both alternatives already with index on column used in GROUP BY. Using GROUP BY on column without index is really bad practice because of creating tmp table ( https://dev.mysql.com/doc/refman/8.0/en/group-by-optimization.html) No other index can improve performance. – mikep Apr 01 '19 at 14:48
20

Normally MySQL allows group by ascending order records only. So we can order records before grouping.

SELECT *
FROM (
  SELECT *
  FROM questions
  ORDER BY id DESC
) AS questions
GROUP BY questions.asker
Kapil dev
  • 223
  • 2
  • 7
  • MySQL supports `GROUP BY col1 ASC/DESC`. But this syntax was removed in MySQL 8.0 and alternative is `GROUP BY col1 ORDER BY col1 ASC/DESC`. See my answer. – mikep Apr 01 '19 at 11:26
  • That does not work if you want to order by a different column than used for the grouping. – JuliSmz Dec 29 '21 at 21:54
6

Im writing this answer because @Taryn's first/shorter alternative in accepted answer works only if you are exactly selecting just columns used in GROUP BY and MAX. User asking question is selecting all columns in table (he used SELECT *). So when you add another 3rd column to table, that column value in query result will be incorrect. You will get mixed values from different table rows. @Taryn's second/longer alternative (using inner join and subquery) works but query is uselessly complicated and is 5 times slower in my use case than my simple alternative below.


Consider table questions:

id | asker 
-----------
1  | Bob
2  | Bob
3  | Marley

Query SELECT max(id) as id, asker FROM questions GROUP BY asker ORDER BY id DESC returns expected:

id | asker 
-----------
3  | Marley
2  | Bob

Now consider another table questions:

id | asker  | other
-------------------
1  | Bob    | 1st
2  | Bob    | 2nd
3  | Marley | 3rd

Query SELECT max(id) as id, asker, other FROM questions GROUP BY asker ORDER BY id DESC returns unexpected:

id | asker  | other
-------------------
3  | Marley | 3rd
2  | Bob    | 1st

... note that value of other for second row of result is incorrect because id=2 comes from second row of table but other=1st comes from first row of table! That is way many users in comments of Taryn's answer reports that this solution does not work.


Possible simple solution when selecting also another columns is to use GROUP BY + DESC:

SELECT id, asker, other FROM questions GROUP BY asker DESC

id | asker  | other
-------------------
3  | Marley | 3rd
2  | Bob    | 2nd

(see demo: https://www.db-fiddle.com/f/esww483qFQXbXzJmkHZ8VT/10)

... but this simple solution has some limitations:

  • Table must be InnoDB (I think it is not problem because you will get better performance and also since MySQL >= 5.5.5 default/preferred storage engine was changed from MyISAM to InnoDB)
  • You have to create index for column which is used in GROUP BY - so asker in this case (I think it is not problem because you will get better performance since index is suitable in this case. GROUP BY usually needs creation of tmp table but when index is available tmp table will not be created which is faster)
  • For MySQL 5.7 and 8.0 is needed to disable SQL mode ONLY_FULL_GROUP_BY (e.g SET SESSION sql_mode = '';) or use ANY_VALUE() on selected columns which are not aggregated to avoid error ER_WRONG_FIELD_WITH_GROUP.
  • Unfortunately MySQL developers removed support of ASC/DESC withing GROUP BY since MySQL 8.0 https://dev.mysql.com/worklog/task/?id=8693 but fortunately there is alternative GROUP BY col1 ORDER BY col1 ASC/DESC:

SELECT id, asker, other FROM questions GROUP BY asker ORDER BY asker DESC

id | asker  | other
-------------------
3  | Marley | 3rd
2  | Bob    | 2nd

(see demo: https://www.db-fiddle.com/f/esww483qFQXbXzJmkHZ8VT/11)

... result is the same as above with GROUP BY ... DESC (do not forget to use InnoDB and create index).

mikep
  • 5,880
  • 2
  • 30
  • 37
  • The explanation was good, but this doesn't seem to work for me. Given is my query where there are 4 lobCodes - AAA, BBB, BBB, AAA - and creator are - 1, 1, 2, 2. I needed answer for BBB with creator as 2 but still got 1. `SELECT lobCode, createdBy FROM pricing_master GROUP BY lobCode ORDER BY lobCode DESC` – Chetan Oswal Feb 26 '20 at 12:05
  • Please provide table structure, table data, mysql version etc.. Use https://www.db-fiddle.com/ I will check it. – mikep Feb 27 '20 at 13:07
5

The records need to be grouped using GROUP BY and MAX() to get the maximum ID for every asker.

SELECT  asker, MAX(ID) ID
FROM    TableName
GROUP   BY asker

OUTPUT

╔════════╦════╗
║ ASKER  ║ ID ║
╠════════╬════╣
║ Bob    ║  2 ║
║ Marley ║  3 ║
╚════════╩════╝
John Woo
  • 258,903
  • 69
  • 498
  • 492
3

The others are correct about using MAX(ID) to get the results you want. If you're wondering why your query doesn't work, it's because ORDER BY happens after the GROUP BY.

Michael L.
  • 620
  • 3
  • 17
2

It's because ORDER BY is performed AFTER GROUP BY.

Try this:

SELECT * FROM questions
WHERE id IN 
(
    SELECT max(id) as id
    FROM questions 
    GROUP by asker 
    ORDER by id DESC
)
evilReiko
  • 19,501
  • 24
  • 86
  • 102
1

To get every column:

SELECT * FROM questions
WHERE id IN 
(SELECT max(id) as id, asker
FROM questions 
GROUP by asker 
ORDER by id DESC)

Improved version of the answer of @bluefeet.

Olimjon
  • 158
  • 1
  • 9