Deleting while using "having" in the query

Question

Let's say we have the following SQL query that returns duplicate emails:

SELECT email, COUNT(email)
FROM users
GROUP BY email
HAVING COUNT(email) > 1;

In the case above, how can we actually delete those duplicate rows? Or delete only one of the duplicates, so that they are no longer duplicates?

Assuming your primary key is a `user`, how would you identify which of the two `user`s you would want to delete? – JNevill, Apr 11 '16 at 20:23
Possible duplicate of [How to delete duplicate entries?](http://stackoverflow.com/questions/1746213/how-to-delete-duplicate-entries) – Vamsi Prabhala, Apr 11 '16 at 20:25

Gordon Linoff · Answer 1 · 2016-04-12T02:02:30.943

3

One method uses ctid:

delete from users
    where ctid not in (select min(ctid)
                       from users
                       group by email
                      );

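This deletes all but one row for each email. ctid is PostgreSQL's internal row identifier (the physical location of the row version), so it can change when a row is updated or the table is rewritten. It would be better to use a user-defined primary key column.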
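As a hedged sketch of that last suggestion, assuming `users` has a primary key column named `id` (a hypothetical name, since the question does not show the schema), the same idea can be written against the key instead of `ctid`:

```sql
-- Assumption: users(id) is a primary key; "id" is a hypothetical column name.
-- Keep the row with the smallest id for each email and delete the rest.
DELETE FROM users
WHERE id NOT IN (
    SELECT MIN(id)
    FROM users
    GROUP BY email
);
```

Note that `GROUP BY` puts all rows with a NULL email into one group, so this would also reduce those rows to a single one. Running the statement inside a transaction and checking the affected row count before committing is a reasonable precaution.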

edited Apr 12 '16 at 02:02

answered Apr 11 '16 at 20:25


*`delete from users` – JNevill Apr 11 '16 at 20:27
@Abelisto . . . I don't see how it would do that. – Gordon Linoff Apr 12 '16 at 02:02
@Abelisto . . . This should delete all rows but one for each email. – Gordon Linoff Apr 13 '16 at 02:06

score 2 · Answer 2 · answered Apr 11 '16 at 20:28

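This method removes the duplicate records from your table even if you don't have a primary key or unique identifier. It deletes through the CTE, which works in SQL Server. Because the window is ordered by the same column it is partitioned by, which row of each group is kept is arbitrary.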

-- RN numbers the rows within each email group; rows with RN > 1 are the extra copies
WITH CTE AS
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY email ORDER BY email) AS RN
    FROM users
)
DELETE FROM CTE WHERE RN > 1;
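PostgreSQL does not accept a CTE as the target of `DELETE`, so there the same `ROW_NUMBER()` idea needs to delete from the base table. A minimal sketch, assuming PostgreSQL and no primary key, with `ctid` used to address the rows (the names `ranked` and `row_ctid` are made up for illustration):

```sql
-- Assumption: PostgreSQL; "ranked" and "row_ctid" are hypothetical names.
-- Number the rows within each email group; rn = 1 is the row that is kept.
WITH ranked AS (
    SELECT ctid AS row_ctid,
           ROW_NUMBER() OVER (PARTITION BY email ORDER BY ctid) AS rn
    FROM users
)
-- Delete every row whose position in its group is 2 or higher.
DELETE FROM users
WHERE ctid IN (SELECT row_ctid FROM ranked WHERE rn > 1);
```

Ordering by `ctid` only makes the choice of surviving row repeatable within this one statement; if a particular row should survive (for example the most recently created one), the `ORDER BY` would need to use a column that expresses that.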
