how to select rows based on distinct values of A COLUMN only

Question

I need to query a table in order to return rows, but I am not able to query the table correctly. Here is my table view:

Id                MailId          EmailAddress          Name
1                 1               a@a.com               Mr. A
2                 1               b@b.com               Mr. B
3                 1               c@c.com               Mr. C
4                 1               d@d.com               Mr. D
5                 1               a@a.com               Mr. A
6                 2               e@e.com               Mr. E
7                 2               a@a.com               Mr. A
8                 3               f@f.com               Mr. F
9                 4               d@d.com               Mr. D  
10                5               f@f.com               Mr. F
11                6               d@d.com               Mr. D

The result set should return:

Id                MailId          EmailAddress          Name
1                 1               a@a.com               Mr. A
2                 1               b@b.com               Mr. B
3                 1               c@c.com               Mr. C
4                 1               d@d.com               Mr. D
6                 2               e@e.com               Mr. E
8                 3               f@f.com               Mr. F

In other words: first, I want to select distinct e-mail addresses, and then return rows containing distinct e-mail addresses.

Note: Just using the "Distinct" keyword will not work here, as it will select distinct rows. My requirement is to select distinct email addresses, and then to select rows containing those addresses.

Edit: I cannot use the "Group By" keyword either, because for this I will also have to Group By with Id (which is the PK) and doing this will return two rows with the same EmailAddress values but with different Ids.

why don't you use only GROUP BY EmailAddress... if you group by your PK, it wont do anything because primary key is always unique, so group by statement wont do anything.. if you use group by emailAddress you get the same results as you specified... — blejzz, Sep 17 '11 at 11:42
It seems you want to select distinct EmailAddress and Name and the first matching Id, MailId for those two. Is that correct? I am also curious how the Id and MailId would be used. — Miserable Variable, Sep 17 '11 at 11:47
Grouping by primary key makes no sense. Grouping removes duplicate values. Primary keys by definition cannot have duplicates. — Thilo, Sep 18 '11 at 15:23
I think what you want is to group by email addr and name, then find the min(mailID), then the corresponding ID — Beth, Sep 20 '16 at 17:53

score 80 · Answer 1 · answered Sep 17 '11 at 11:45

80

Looking at your output maybe the following query can work, give it a try:

SELECT * FROM tablename
WHERE id IN
(SELECT MIN(id) FROM tablename GROUP BY EmailAddress)

This will select only one row for each distinct email address, the row with the minimum id which is what your result seems to portray

answered Sep 17 '11 at 11:45

danishgoel

3,650
1
18
30

1

I think you need to include the name field, not just email addr, although he doesn't have an example of it. – Beth Sep 20 '16 at 17:48

score 21 · Answer 2 · answered Sep 17 '11 at 11:47

Try this - you need a CTE (Common Table Expression) that partitions (groups) your data by distinct e-mail address, and sorts each group by ID - smallest first. Then you just select the first entry for each group - that should give you what you're looking for:

;WITH DistinctMails AS
(
    SELECT ID, MailID, EMailAddress, NAME,
        ROW_NUMBER() OVER(PARTITION BY EMailAddress ORDER BY ID) AS 'RowNum'
    FROM dbo.YourMailTable
)
SELECT *
FROM DistinctMails
WHERE RowNum = 1

This works on SQL Server 2005 and newer (you didn't mention what version you're using...)

Setmax · Answer 3 · 2014-07-26T01:06:03.970

2

use this(assume that your table name is emails):

select * from emails as a 
inner join  
(select EmailAddress, min(Id) as id from emails 
group by EmailAddress ) as b 
on a.EmailAddress = b.EmailAddress 
and a.Id = b.id

hope this help..

edited Jul 26 '14 at 01:06

answered Jul 26 '14 at 01:00

Setmax

946
6
10

score -1 · Answer 4 · edited Sep 20 '16 at 16:11

-1

I am not sure about your DBMS. So, I created a temporary table in Redshift and from my experience, I think this query should return what you are looking for:

select min(Id), distinct MailId, EmailAddress, Name
    from yourTableName
    group by MailId, EmailAddress, Name

I see that I am using a GROUP BY clause but you still won't have two rows against any particular MailId.

edited Sep 20 '16 at 16:11

Alicia Garcia-Raboso

13,193
1
43
48

answered Sep 20 '16 at 15:37

Satyarth Shankar

1
2

score -3 · Answer 5 · answered Sep 17 '11 at 11:36

-3

if you dont wanna use DISTINCT use GROUP BY

 SELECT * FROM myTABLE GROUP BY EmailAddress

answered Sep 17 '11 at 11:36

blejzz

3,349
4
39
60

1

see my comment, my answer returns the results you wanted. – blejzz Sep 17 '11 at 11:45
16

it returns error Msg 8120, Level 16, State 1, Line 1 Column 'EmailContact.RowId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. – user576510 Sep 17 '11 at 11:58

how to select rows based on distinct values of A COLUMN only

5 Answers5

Linked