select unique rows based on single distinct column

Question

I want to select rows that have a distinct email. See the example table below:

+----+---------+-------------------+-------------+
| id | title   | email             | commentname |
+----+---------+-------------------+-------------+
|  3 | test    | rob@hotmail.com   | rob         |
|  4 | i agree | rob@hotmail.com   | rob         |
|  5 | its ok  | rob@hotmail.com   | rob         |
|  6 | hey     | rob@hotmail.com   | rob         |
|  7 | nice!   | simon@hotmail.com | simon       |
|  8 | yeah    | john@hotmail.com  | john        |
+----+---------+-------------------+-------------+

The desired result would be:

+----+-------+-------------------+-------------+
| id | title | email             | commentname |
+----+-------+-------------------+-------------+
|  3 | test  | rob@hotmail.com   | rob         |
|  7 | nice! | simon@hotmail.com | simon       |
|  8 | yeah  | john@hotmail.com  | john        |
+----+-------+-------------------+-------------+

I don't care which id value is returned for each email. What SQL would produce this?

score 118 · Accepted Answer · edited Mar 16 '16 at 14:06

A quick one in T-SQL:

SELECT a.*
FROM emails a
INNER JOIN 
  (SELECT email,
    MIN(id) as id
  FROM emails 
  GROUP BY email 
) AS b
  ON a.email = b.email 
  AND a.id = b.id;
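
To check the query above against the sample rows from the question, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for SQL Server, so it only confirms the logic, not T-SQL specifics. The table name `emails` comes from the answer; the `ORDER BY` is added here only to make the output order predictable.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, title TEXT, email TEXT, commentname TEXT)"
)
conn.executemany(
    "INSERT INTO emails VALUES (?, ?, ?, ?)",
    [
        (3, "test", "rob@hotmail.com", "rob"),
        (4, "i agree", "rob@hotmail.com", "rob"),
        (5, "its ok", "rob@hotmail.com", "rob"),
        (6, "hey", "rob@hotmail.com", "rob"),
        (7, "nice!", "simon@hotmail.com", "simon"),
        (8, "yeah", "john@hotmail.com", "john"),
    ],
)

# Same join as the answer: keep the row whose id is the smallest for its email.
rows_min_id = conn.execute(
    """
    SELECT a.*
    FROM emails a
    INNER JOIN
      (SELECT email, MIN(id) AS id
       FROM emails
       GROUP BY email
      ) AS b
      ON a.email = b.email
      AND a.id = b.id
    ORDER BY a.id
    """
).fetchall()

for row in rows_min_id:
    print(row)
```

This yields one row per email (ids 3, 7 and 8), which matches the desired result in the question.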

answered Nov 25 '11 at 20:51 by Turbot · edited Mar 16 '16 at 14:06 by ypercubeᵀᴹ

Wow, that was fast, guys! :) laptop's answer was the shortest and easiest, thanks! – Adam Nov 25 '11 at 20:58
The `distinct` keyword is not necessary here. Also, it seems like a join on just `id` would do the trick as well. – Adam Robinson Nov 25 '11 at 21:07
I have a huge table whose primary key is a composite of two columns; it does not work in that case. – AurA Jul 06 '12 at 10:20
@downvoter, what do you mean by not working? Perhaps that should be a separate question. – Turbot Jul 06 '12 at 13:03
Excellent, I changed the min to max to get the last row in the duplicate instead of first – Dr. Mian Jan 07 '16 at 23:30

score 45 · Answer 2 · answered Nov 25 '11 at 20:40

I'm assuming you mean that you don't care which row supplies the title, id, and commentname values (you have "rob" for all of the rows, but I don't know whether your data model actually enforces that). If so, you can use a window function to return the first row for each email address:

select
    id,
    title,
    email,
    commentname

from
(
select 
    *, 
    row_number() over (partition by email order by id) as RowNbr 

from YourTable
) source

where RowNbr = 1
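
As a rough check of the window-function approach, the sketch below runs the same query shape on the question's sample rows. Assumptions: SQLite 3.25 or newer (needed for window functions) stands in for SQL Server, `YourTable` is the placeholder name from the query above, and the helper `one_row_per_email` is a hypothetical name introduced only for this illustration. It also shows that flipping the sort direction inside `over (...)` picks the last row per email instead of the first.

```python
import sqlite3

# Assumptions: SQLite >= 3.25 stands in for SQL Server; YourTable is a placeholder name.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE YourTable (id INTEGER, title TEXT, email TEXT, commentname TEXT)"
)
db.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        (3, "test", "rob@hotmail.com", "rob"),
        (4, "i agree", "rob@hotmail.com", "rob"),
        (5, "its ok", "rob@hotmail.com", "rob"),
        (6, "hey", "rob@hotmail.com", "rob"),
        (7, "nice!", "simon@hotmail.com", "simon"),
        (8, "yeah", "john@hotmail.com", "john"),
    ],
)


def one_row_per_email(direction):
    """Hypothetical helper: return one row per email, first ("asc") or last ("desc") by id."""
    # The direction is spliced into the SQL text, so only two fixed keywords are allowed.
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    query = f"""
        select id, title, email, commentname
        from (
            select *,
                   row_number() over (partition by email order by id {direction}) as RowNbr
            from YourTable
        ) source
        where RowNbr = 1
        order by id
    """
    return db.execute(query).fetchall()


print(one_row_per_email("asc"))   # first row per email
print(one_row_per_email("desc"))  # last row per email
```

Because the row number is assigned inside each email partition, this does not depend on id being unique across the table, only on the ordering you choose within each partition.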

answered Nov 25 '11 at 20:40 by Adam Robinson

This is the best solution, because it can apply to duplicate rows that do not have a unique identity column, or ones that do. – Antony Booth Mar 16 '15 at 18:27
Yes, this solved the issue for me. The solution above only grouped the table data together (i.e. for Microsoft SQL Server 2008 data). Thanks, Adam. – Siwoku Adeola Jun 14 '16 at 06:28
This is a really good solution that works great for smaller tables. Is there a way to do this without having to list each column in the SELECT statement? – David Mar 04 '21 at 18:56

score 5 · Answer 3 · edited May 23 '17 at 10:31

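If you are using MySQL 5.7 or later, according to these links (MySQL Official, SO QA), we can select one record per group with GROUP BY without needing any aggregate functions. Note that from MySQL 5.7.5 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, so this works only if that mode is disabled or if the selected columns are functionally dependent on the grouped column.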

So the query can be simplified to this.

select * from comments_table group by commentname;

Try out the query in action here

answered Jun 09 '16 at 12:31 by RamValli · edited May 23 '17 at 10:31 by Community

Unfortunately, the question is tagged with tsql and sqlserver. – starwed Jul 12 '16 at 14:33
Even though it was the right answer to the wrong question I ended up here looking for this solution for mysql so take my updoot – Rick Kukiela May 24 '19 at 15:34
Nice solution deserves more respects – Kai Wang May 30 '19 at 18:19
didn't work with mysql Ver 8.0.29-0ubuntu0.20.04.3 for Linux on x86_64 ((Ubuntu)) – UMR May 16 '22 at 03:48

score 2 · Answer 4 · answered Nov 25 '11 at 20:43

Since you don't care which id is returned, I would take the MAX id for each email to keep the SQL query simple. Give it a try:

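;WITH ue(id)
 AS
 (
   -- one id (the highest) per email
   -- table name emails is assumed here; TABLE itself is a reserved word
   SELECT MAX(id)
   FROM emails
   GROUP BY email
 )
 -- t.* avoids returning the id column twice
 SELECT t.*
 FROM emails t
 INNER JOIN ue ON ue.id = t.id;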
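
A small sketch of the CTE approach on the question's sample data, with a few assumptions stated up front: SQLite (via Python's sqlite3) stands in for SQL Server, the table is assumed to be named `emails` because `table` is a reserved word and cannot be used unquoted, and the leading semicolon is dropped since it is not needed here. It also compares `SELECT t.*` with a plain `SELECT *` over the join, which brings back the CTE's id as an extra column.

```python
import sqlite3

# Assumptions: SQLite stands in for SQL Server; the table name `emails` is assumed.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE emails (id INTEGER PRIMARY KEY, title TEXT, email TEXT, commentname TEXT)"
)
con.executemany(
    "INSERT INTO emails VALUES (?, ?, ?, ?)",
    [
        (3, "test", "rob@hotmail.com", "rob"),
        (4, "i agree", "rob@hotmail.com", "rob"),
        (5, "its ok", "rob@hotmail.com", "rob"),
        (6, "hey", "rob@hotmail.com", "rob"),
        (7, "nice!", "simon@hotmail.com", "simon"),
        (8, "yeah", "john@hotmail.com", "john"),
    ],
)

# The CTE collects the highest id for each email.
cte = "WITH ue(id) AS (SELECT MAX(id) FROM emails GROUP BY email) "

# Join back to the table to fetch the full row for each of those ids.
latest_rows = con.execute(
    cte + "SELECT t.* FROM emails t INNER JOIN ue ON ue.id = t.id ORDER BY t.id"
).fetchall()

# For comparison: SELECT * over the join also returns the CTE's id column.
star_cursor = con.execute(
    cte + "SELECT * FROM emails t INNER JOIN ue ON ue.id = t.id"
)
star_columns = [col[0] for col in star_cursor.description]

print(latest_rows)
print(star_columns)
```

With MAX, the row kept for rob@hotmail.com is the one with id 6 rather than id 3, which is fine given that the question does not care which id is returned.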

answered Nov 25 '11 at 20:43 by sll

score -2 · Answer 5 · answered Jul 23 '22 at 09:57

SELECT * FROM emails GROUP BY email;

answered Jul 23 '22 at 09:57 by Deepak Raj
