Getting MAX of column x while getting DISTINCT on column y

Question

I want to join these tables (task_id with post_id) which will return everything you see below. That's fine but what I want to do next is to remove duplicates of task_id while keeping the item that has the highest version.

I have 2 tables like so:

items

ID    title    task_id    project_id     version
1    "Test"      123         456            1
2   "Test 2"     124         456            1
3   "Test 3"     125         456            1
4    "X 3.1"     125         456           1.1
5    "X 3.2"     125         456           1.2

tasks

ID    post_id    meta_key    meta_value
1       123    _completed       0
4       124    _completed       0
5       125    _completed       0

and I have this SQL statement so far:

SELECT * 
FROM items t0 
INNER JOIN tasks AS t1 
 ON t0.task_id = t1.post_id 
WHERE t1.meta_key = '_completed' 
 AND project_id = 456

which returns:

ID    title    task_id    project_id    version   ID    post_id    meta_key    meta_value
1    "Test"      123         456           1      1       123    _completed       0
2   "Test 2"     124         456           1      1       124    _completed       0
3   "Test 3"     125         456           1      1       125    _completed       0
4    "X 3.1"     125         456          1.1     1       125    _completed       0
5    "X 3.2"     125         456          1.2     1       125    _completed       0

How can I remove duplicate task_id's but keep the highest version so the table will be:

ID    title    task_id    project_id    version   ID    post_id    meta_key    meta_value
1    "Test"      123         456           1      1       123    _completed       0
2   "Test 2"     124         456           1      1       124    _completed       0
5    "X 3.2"     125         456          1.2     1       125    _completed       0

Basically I am closest by ORDERing them BY version and the using GROUP BY task_id but apparently SQL does not let you do this.

Can anyone think of a good way to achieve this?

This question is asked and answered endlessly on SO. A small number of those answers are correct. Plus, the topic is covered in some depth within the manual. That said, if you're still struggling, see http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query — Strawberry, Nov 16 '16 at 07:15
Try having `MAX(version)` in `SELECT` statement and have `task_id` in `GROUP BY` clause. — Viki888, Nov 16 '16 at 07:18

score 0 · Accepted Answer · edited May 23 '17 at 12:07

0

Based on a related post I think this should work.

SELECT *
FROM items t0
INNER JOIN tasks AS t1 ON t0.task_id = t1.post_id
INNER JOIN
    (SELECT task_id, MAX(version) AS MaxVersion
    FROM items
    GROUP BY task_id) groupedt0 
ON t0.task_id = groupedt0.task_id 
AND t0.version = groupedt0.MaxVersion
AND project_id = 456
AND t1.meta_key = '_completed'
ORDER BY ID ASC

See: How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?

edited May 23 '17 at 12:07

Community

1
1

answered Nov 17 '16 at 00:15

MomasVII

4,641
5
35
52

Works perfectly. Thank you! Sorry didn't think a search would find it that easily. Plus i'm pretty average at SQL at the moment. – SamohtVII Nov 17 '16 at 00:48

Getting MAX of column x while getting DISTINCT on column y

1 Answers1