Finding duplicate values in a SQL table

Question

It's easy to find duplicates with one field:

SELECT email, COUNT(email) 
FROM users
GROUP BY email
HAVING COUNT(email) > 1

So if we have a table

ID   NAME   EMAIL
1    John   asd@asd.com
2    Sam    asd@asd.com
3    Tom    asd@asd.com
4    Bob    bob@asd.com
5    Tom    asd@asd.com

This query will give us John, Sam, Tom, Tom because they all have the same email.

However, what I want is to get duplicates with the same email and name.

That is, I want to get "Tom", "Tom".

The reason I need this: I made a mistake, and allowed inserting duplicate name and email values. Now I need to remove/change the duplicates, so I need to find them first.

I don't think it would let you select name in your first sample since it's not in an aggregate function. "What is the count of matching email addresses and their name" is some tricky logic... — sXe, Jan 04 '13 at 18:09
Found that this doesn't work with MSSQL server because of the `name` field in the SELECT. — E. van Putten, Nov 08 '18 at 09:06

score 3643 · Accepted Answer · edited Jun 01 '18 at 11:59

3643

SELECT
    name, email, COUNT(*)
FROM
    users
GROUP BY
    name, email
HAVING 
    COUNT(*) > 1

Simply group on both of the columns.

Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of "functional dependency":

In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.

Support is not consistent:

Recent PostgreSQL supports it.
SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
MySQL is unpredictable and you need sql_mode=only_full_group_by:
- GROUP BY lname ORDER BY showing wrong results;
- Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
Oracle isn't mainstream enough (warning: humour, I don't know about Oracle).

edited Jun 01 '18 at 11:59

Andriy M

76,112
17
94
154

answered Apr 07 '10 at 18:20

gbn

422,506
82
585
676

107

@webXL WHERE works with single record HAVING works with group – bjan Apr 11 '13 at 04:27
9

@gbn Is it possible to include the Id in the results? Then it would be easier to delete those duplicates afterwards. – user797717 Jun 10 '14 at 09:36
15

@user797717: you'd need to have MIN(ID) and then delete for ID values not in the last if MIN(ID) values – gbn Jun 10 '14 at 09:59
2

What about cases where any of the columns have null values? – Ankit Dhingra Sep 16 '16 at 06:38
@AnkitDhingra The NULL values are grouped like "bob" or "gbn". No special behaviour – gbn Sep 16 '16 at 09:58
3

Thanks so much for this, and yes it does work in Oracle, though I needed uniqueness of the condition, so rather than `>1` `=1` – Bill Naylor Oct 07 '19 at 16:17
To find duplicate value count: SELECT COUNT(*) FROM table where value = 'value1' – Rudrika Jan 30 '20 at 06:45
@gbn . In this way of writing query we are able to find the duplicate name and email in users table . what if we want to find the IDs of users who have duplicate name and email id . Can you please write a query to find IDs of users who have duplicate name and email id ? – RajGopalbh4 Apr 28 '20 at 05:25
I seriously come back to this post at least twice a week. going to have to start using my memory more. Thanks. – AtLeastTheresToast Jul 29 '20 at 19:05
Thanks for it. I need to find duplicate row that having several same fields value. – Zhou Haibo Aug 26 '20 at 05:31
what if I want the total amount of the result entries ? – Berni Nov 20 '20 at 18:51
@gbn can you give some example of the question of @user797717? – maniootek Dec 21 '20 at 09:09
See @gaurav-singh's answer below with `row_number()` to see whole record (all fields). – Hans Ginzel Jun 18 '21 at 16:42
@BillNaylor The functional dependency is not implemented by Oracle 19: if you write `group by id` instead of `group by name, email` it is rejected by Oracle. This is actually explained in the blog post linked, so I don't know why Andry said he didn't know for Oracle. – Johan Boulé Apr 26 '23 at 16:00

score 436 · Answer 2 · edited Apr 10 '18 at 08:46

try this:

declare @YourTable table (id int, name varchar(10), email varchar(50))

INSERT @YourTable VALUES (1,'John','John-email')
INSERT @YourTable VALUES (2,'John','John-email')
INSERT @YourTable VALUES (3,'fred','John-email')
INSERT @YourTable VALUES (4,'fred','fred-email')
INSERT @YourTable VALUES (5,'sam','sam-email')
INSERT @YourTable VALUES (6,'sam','sam-email')

SELECT
    name,email, COUNT(*) AS CountOf
    FROM @YourTable
    GROUP BY name,email
    HAVING COUNT(*)>1

OUTPUT:

name       email       CountOf
---------- ----------- -----------
John       John-email  2
sam        sam-email   2

(2 row(s) affected)

if you want the IDs of the dups use this:

SELECT
    y.id,y.name,y.email
    FROM @YourTable y
        INNER JOIN (SELECT
                        name,email, COUNT(*) AS CountOf
                        FROM @YourTable
                        GROUP BY name,email
                        HAVING COUNT(*)>1
                    ) dt ON y.name=dt.name AND y.email=dt.email

OUTPUT:

id          name       email
----------- ---------- ------------
1           John       John-email
2           John       John-email
5           sam        sam-email
6           sam        sam-email

(4 row(s) affected)

to delete the duplicates try:

DELETE d
    FROM @YourTable d
        INNER JOIN (SELECT
                        y.id,y.name,y.email,ROW_NUMBER() OVER(PARTITION BY y.name,y.email ORDER BY y.name,y.email,y.id) AS RowRank
                        FROM @YourTable y
                            INNER JOIN (SELECT
                                            name,email, COUNT(*) AS CountOf
                                            FROM @YourTable
                                            GROUP BY name,email
                                            HAVING COUNT(*)>1
                                        ) dt ON y.name=dt.name AND y.email=dt.email
                   ) dt2 ON d.id=dt2.id
        WHERE dt2.RowRank!=1
SELECT * FROM @YourTable

OUTPUT:

id          name       email
----------- ---------- --------------
1           John       John-email
3           fred       John-email
4           fred       fred-email
5           sam        sam-email

(4 row(s) affected)

* Table names are case sensitivearray(3) { [0]=> string(5) "42000" [1]=> int(1064) [2]=> string(226) "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '(PARTITION BY y.employee_id, y.leave_type_id ) AS RowRank ' at line 1" } — JAMZAD, Dec 23 '20 at 10:53

score 153 · Answer 3 · answered Apr 07 '10 at 18:20

153

Try this:

SELECT name, email
FROM users
GROUP BY name, email
HAVING ( COUNT(*) > 1 )

answered Apr 07 '10 at 18:20

Chris Van Opstal

36,423
9
73
90

Tancrede Chazallet · Answer 4 · 2016-12-30T12:37:35.720

100

If you want to delete the duplicates, here's a much simpler way to do it than having to find even/odd rows into a triple sub-select:

SELECT id, name, email 
FROM users u, users u2
WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id

And so to delete:

DELETE FROM users
WHERE id IN (
    SELECT id/*, name, email*/
    FROM users u, users u2
    WHERE u.name = u2.name AND u.email = u2.email AND u.id > u2.id
)

Much more easier to read and understand IMHO

Note: The only issue is that you have to execute the request until there is no rows deleted, since you delete only 1 of each duplicate each time

edited Dec 30 '16 at 12:37

answered Mar 14 '16 at 14:22

Tancrede Chazallet

7,035
6
41
62

3

Nice and easy to read; I'd like to find a way that deleted multiple duplicate rows in one go though. – Dickon Reed Apr 09 '16 at 05:29
1

This doesn't work for me as I get `You can't specify target table 'users' for update in FROM clause` – Whitecat Jan 17 '17 at 17:55
1

@Whitecat seems like a simple MySQL problem: http://stackoverflow.com/questions/4429319/you-cant-specify-target-table-for-update-in-from-clause – Tancrede Chazallet Jan 17 '17 at 23:12
1

Fails for me. I get: "DBD::CSV::st execute failed: Use of uninitialized value $_[1] in hash element at /Users/hornenj/perl5/perlbrew/perls/perl-5.26.0/lib/site_perl/5.26.0/SQL/Eval.pm line 43" – Nigel Horne Aug 28 '17 at 14:31
1

I think that where clause should be " u.name = u2.name AND u.email = u2.email AND (u.id > u2.id OR u2.id > u.id)" isn't it? – GiveEmTheBoot Feb 18 '19 at 16:15
@GiveEmTheBoot since u and u2 are the same tables, it doesn't really matter, but yeah, I guess doing `u.name = u2.name AND u.email = u2.email AND u.id <> u2.id` would work just as well – Tancrede Chazallet Feb 25 '19 at 11:47
1

@GiveEmTheBoot your adding extra complication really. I would just keep it the way AncAinu has put it. It really doesn't matter. Also, I just want to point out - props to AncAinu, this is really a genius explanation. If people have their tables laid out correctly with IDs generated for each record this works flawlessly. Great, and easy to read and maintain! – Yusha Apr 15 '19 at 18:17

score 64 · Answer 5 · edited Jun 19 '21 at 11:09

In contrast to other answers you can view the whole records containing all columns if there are any. In the PARTITION BY part of row_number function choose the desired unique/duplicit columns.

SELECT  *
FROM    (
 SELECT a.*
 ,      Row_Number() OVER (PARTITION BY Name, Age ORDER BY Name) AS r
 FROM   Customers AS a
)       AS b
WHERE   r > 1;

When you want to select ALL duplicated records with ALL fields you can write it like

CREATE TABLE test (
        id      bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
,       c1      integer
,       c2      text
,       d       date DEFAULT now()
,       v       text
);

INSERT INTO test (c1, c2, v) VALUES
(1, 'a', 'Select'),
(1, 'a', 'ALL'),
(1, 'a', 'multiple'),
(1, 'a', 'records'),
(2, 'b', 'in columns'),
(2, 'b', 'c1 and c2'),
(3, 'c', '.');
SELECT * FROM test ORDER BY 1;

SELECT  *
FROM    test
WHERE   (c1, c2) IN (
 SELECT c1, c2
 FROM   test
 GROUP  BY 1,2
 HAVING count(*) > 1
)
ORDER   BY 1;

Tested in PostgreSQL.

A slight change to SELECT * helped me solved an hour search. I have never used the OVER(PARTITION BY before. I never cease to be amazed at how many ways to do the same thing in SQL! — Joe Ruder, Sep 14 '16 at 00:42

score 39 · Answer 6 · edited Jul 22 '15 at 07:40

39

 SELECT name, email 
    FROM users
    WHERE email in
    (SELECT email FROM users
    GROUP BY email 
    HAVING COUNT(*)>1)

edited Jul 22 '15 at 07:40

Moyed Ansari

8,436
2
36
57

answered Jul 22 '15 at 07:12

PRADEEPTA VIRLLEY

417
4
2

score 36 · Answer 7 · edited Jun 19 '21 at 11:11

36

A little late to the party but I found a really cool workaround to finding all duplicate IDs:

SELECT email, GROUP_CONCAT(id)
FROM   users
GROUP  BY email
HAVING COUNT(email) > 1;

edited Jun 19 '21 at 11:11

Hans Ginzel

8,192
3
24
22

answered Nov 17 '15 at 10:21

Indivision Dev

1,097
11
16

3

Appears to be a syntactical sugar work around. Nice find. – Chef_Code Jan 31 '16 at 20:36
9

Keep in mind that `GROUP_CONCAT` will stop after some predetermined length, so you might not get all the `id`s. – v010dya Dec 10 '16 at 08:39

Martin Silovský · Answer 8 · 2023-01-20T09:31:26.077

27

This selects/deletes all duplicate records except one record from each group of duplicates. So, the delete leaves all unique records + one record from each group of the duplicates.

Select duplicates:

SELECT *
FROM <table>
WHERE
    id NOT IN (
        SELECT MIN(id)
        FROM table
        GROUP BY <column1>, <column2>
);

Delete duplicates:

DELETE FROM <table>
WHERE
    id NOT IN (
        SELECT MIN(id)
        FROM table
        GROUP BY <column1>, <column2>
);

Be aware of larger amounts of records, it can cause performance problems.

edited Jan 20 '23 at 09:31

answered Feb 22 '17 at 15:02

Martin Silovský

309
3
6

2

Error in delete query - You can't specify target table 'cities' for update in FROM clause – Ali Azhar Mar 16 '18 at 09:09
2

There is neither table 'cities' nor update clause. What do you mean? Where is an error in the delete query? – Martin Silovský Apr 16 '18 at 18:12

Tanmay Nehete · Answer 9 · 2018-01-30T08:50:59.233

23

try this code

WITH CTE AS

( SELECT Id, Name, Age, Comments, RN = ROW_NUMBER()OVER(PARTITION BY Name,Age ORDER BY ccn)
FROM ccnmaster )
select * from CTE

edited Jan 30 '18 at 08:50

answered Sep 13 '14 at 04:03

Tanmay Nehete

2,138
4
31
42

score 19 · Answer 10 · answered Jun 16 '14 at 08:50

In case you work with Oracle, this way would be preferable:

create table my_users(id number, name varchar2(100), email varchar2(100));

insert into my_users values (1, 'John', 'asd@asd.com');
insert into my_users values (2, 'Sam', 'asd@asd.com');
insert into my_users values (3, 'Tom', 'asd@asd.com');
insert into my_users values (4, 'Bob', 'bob@asd.com');
insert into my_users values (5, 'Tom', 'asd@asd.com');

commit;

select *
  from my_users
 where rowid not in (select min(rowid) from my_users group by name, email);

score 17 · Answer 11 · answered Nov 18 '21 at 01:00

Well this question has been answered very neatly in all the above answers. But I would like to list all the possible manners, we can do this in various ways which may impart the understanding how we can do it and seeker can pick one of the solution which best fits to his/her need as this is one of the most common query SQL developer come across different business usecases or sometime in interviews as well.

Creating Sample Data

I will start with setting up some sample data from this question only.

Create table NewTable (id int, name varchar(10), email varchar(50))
INSERT  NewTable VALUES (1,'John','asd@asd.com')
INSERT  NewTable VALUES (2,'Sam','asd@asd.com')
INSERT  NewTable VALUES (3,'Tom','asd@asd.com')
INSERT  NewTable VALUES (4,'Bob','bob@asd.com')
INSERT  NewTable VALUES (5,'Tom','asd@asd.com')

1. USING GROUP BY CLAUSE

SELECT
    name,email, COUNT(*) AS Occurence
    FROM NewTable
    GROUP BY name,email
    HAVING COUNT(*)>1

How it works:

the GROUP BY clause groups the rows into groups by values in both name and email columns.
Then, the COUNT() function returns the number of occurrences of each group (name,email).
Then, the HAVING clause keeps only duplicate groups, which are groups that have more than one occurrence.

2. Using CTE:

To return the entire row for each duplicate row, you join the result of the above query with the NewTable table using a common table expression (CTE):

WITH cte AS (
    SELECT
        name, 
        email, 
        COUNT(*) occurrences
    FROM NewTable
    GROUP BY 
        name, 
        email
    HAVING COUNT(*) > 1
)
SELECT 
    t1.Id,
    t1.name, 
    t1.email
FROM  NewTable t1
    INNER JOIN cte ON 
        cte.name = t1.name AND 
        cte.email = t1.email
ORDER BY 
    t1.name, 
    t1.email;

3. Using ROW_NUMBER() function

WITH cte AS (
    SELECT 
        name, 
        email, 
        ROW_NUMBER() OVER (
            PARTITION BY name,email
            ORDER BY name,email) rownum
    FROM 
        NewTable t1
) 
SELECT 
  * 
FROM 
    cte 
WHERE 
    rownum > 1;

How it works:

ROW_NUMBER() distributes rows of the NewTable table into partitions by values in the name and email columns. The duplicate rows will have repeated values in the name and email columns, but different row numbers
Outer query removes the first row in each group.

Well Now I believe, you can have sound Idea of how to find duplicates and apply the logic to find duplicate in all possible scenarios. Thanks.

The code in those images is redundant & the results should be text. . [Why should I not upload images of code/data/errors when asking a question?](https://meta.stackoverflow.com/q/285551/3404097) [Why are images of text, code and mathematical expressions discouraged?](https://meta.stackexchange.com/q/320052/266284) — philipxy, May 18 '22 at 04:33

score 15 · Answer 12 · answered Sep 08 '16 at 06:41

15

select name, email
, case 
when ROW_NUMBER () over (partition by name, email order by name) > 1 then 'Yes'
else 'No'
end "duplicated ?"
from users

answered Sep 08 '16 at 06:41

Narendra

345
4
17

2

Code only answers are frowned upon on Stack Overflow, could you explain why this answers the question? – Rich Benner Sep 08 '16 at 07:36
2

@RichBenner: I didn't find the response such as, each & every row in the result and which tells us which all are duplicate rows and which are not in one glance and that to not group by, because if we want to combine this query with any other query group by is not a good option. – Narendra Sep 09 '16 at 01:27
2

Adding Id to the select statement and filtering on duplicated , it give you the possibility to delete the duplicated ids and keep on of each. – Antoine Reinhold Bertrand Jul 16 '19 at 16:15

score 14 · Answer 13 · edited Sep 15 '21 at 12:24

14

I think this will help you

SELECT name, email, COUNT(* ) 
FROM users
GROUP BY name, email
HAVING COUNT(*)>1

edited Sep 15 '21 at 12:24

Suraj Rao

29,388
11
94
103

answered Sep 15 '21 at 12:24

Gaurav Kumar

331
4
5

score 12 · Answer 14 · answered Aug 26 '14 at 10:07

If you wish to see if there is any duplicate rows in your table, I used below Query:

create table my_table(id int, name varchar(100), email varchar(100));

insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (1, 'shekh', 'shekh@rms.com');
insert into my_table values (2, 'Aman', 'aman@rms.com');
insert into my_table values (3, 'Tom', 'tom@rms.com');
insert into my_table values (4, 'Raj', 'raj@rms.com');


Select COUNT(1) As Total_Rows from my_table 
Select Count(1) As Distinct_Rows from ( Select Distinct * from my_table) abc

score 12 · Answer 15 · edited May 08 '15 at 07:10

12

SELECT id, COUNT(id) FROM table1 GROUP BY id HAVING COUNT(id)>1;

I think this will work properly to search repeated values in a particular column.

edited May 08 '15 at 07:10

Schemetrical

5,506
2
26
43

answered May 08 '15 at 06:41

user4877838

149
1
2

7

This doesn't quite add anything to [the top answer](http://stackoverflow.com/a/2594855/419956), and technically doesn't even really differ from the code OP's posted in the question. – Jeroen May 08 '15 at 07:14

Darrel Lee · Answer 16 · 2016-07-01T19:20:46.033

This is the easy thing I've come up with. It uses a common table expression (CTE) and a partition window (I think these features are in SQL 2008 and later).

This example finds all students with duplicate name and dob. The fields you want to check for duplication go in the OVER clause. You can include any other fields you want in the projection.

with cte (StudentId, Fname, LName, DOB, RowCnt)
as (
SELECT StudentId, FirstName, LastName, DateOfBirth as DOB, SUM(1) OVER (Partition By FirstName, LastName, DateOfBirth) as RowCnt
FROM tblStudent
)
SELECT * from CTE where RowCnt > 1
ORDER BY DOB, LName

score 12 · Answer 17 · edited Jan 08 '20 at 18:16

12

select id,name,COUNT(*) from user group by Id,Name having COUNT(*)>1

edited Jan 08 '20 at 18:16

Jean-François Fabre

137,073
23
153
219

answered Sep 12 '16 at 18:18

Debendra Dash

5,334
46
38

score 11 · Answer 18 · edited Oct 15 '14 at 16:41

11

 select emp.ename, emp.empno, dept.loc 
          from emp
 inner join dept 
          on dept.deptno=emp.deptno
 inner join
    (select ename, count(*) from
    emp
    group by ename, deptno
    having count(*) > 1)
 t on emp.ename=t.ename order by emp.ename
/

edited Oct 15 '14 at 16:41

kapex

28,903
6
107
121

answered Oct 15 '14 at 15:38

naveed

131
1
2

score 10 · Answer 19 · edited Dec 11 '14 at 10:32

10

How we can count the duplicated values?? either it is repeated 2 times or greater than 2. just count them, not group wise.

as simple as

select COUNT(distinct col_01) from Table_01

edited Dec 11 '14 at 10:32

SysDragon

9,692
15
60
89

answered Dec 11 '14 at 10:28

Muhammad Tahir

325
3
17

2

How would this work for the question as asked? This does *not* give rows that duplicate information in multiple columns (e.g. "email" and "name") in different rows. – Jeroen May 08 '15 at 07:18

score 10 · Answer 20 · answered Sep 26 '16 at 12:23

10

By Using CTE also we can find duplicate value like this

with MyCTE
as
(
select Name,EmailId,ROW_NUMBER() over(PARTITION BY EmailId order by id) as Duplicate from [Employees]

)
select * from MyCTE where Duplicate>1

answered Sep 26 '16 at 12:23

Debendra Dash

5,334
46
38

score 7 · Answer 21 · answered Apr 14 '16 at 23:02

This should also work, maybe give it try.

  Select * from Users a
            where EXISTS (Select * from Users b 
                where (     a.name = b.name 
                        OR  a.email = b.email)
                     and a.ID != b.id)

Especially good in your case If you search for duplicates who have some kind of prefix or general change like e.g. new domain in mail. then you can use replace() at these columns

score 7 · Answer 22 · answered Jul 22 '16 at 20:29

7

SELECT * FROM users u where rowid = (select max(rowid) from users u1 where
u.email=u1.email);

answered Jul 22 '16 at 20:29

Panky031

425
1
5
14

score 6 · Answer 23 · edited Feb 17 '20 at 19:50

6

SELECT name, email,COUNT(email) 
FROM users 
WHERE email IN (
    SELECT email 
    FROM users 
    GROUP BY email 
    HAVING COUNT(email) > 1)

edited Feb 17 '20 at 19:50

Software Engineer

15,457
7
74
102

answered Sep 12 '19 at 17:09

Mohammad Neamul Islam

283
6
6

You can't use `COUNT` without `GROUP BY`, unless it refers to the whole table. – RalfFriedl Sep 12 '19 at 17:31
Without Group By you used COUNT but here i have doing a typing mistake to write COUNT – Mohammad Neamul Islam Sep 12 '19 at 18:00

RyanAbnavi · Answer 24 · 2020-05-28T20:34:44.833

The most important thing here is to have the fastest function. Also indices of duplicates should be identified. Self join is a good option but to have a faster function it is better to first find rows that have duplicates and then join with original table for finding id of duplicated rows. Finally order by any column except id to have duplicated rows near each other.

SELECT u.*
FROM users AS u
JOIN (SELECT username, email
      FROM users
      GROUP BY username, email
      HAVING COUNT(*)>1) AS w
ON u.username=w.username AND u.email=w.email
ORDER BY u.email;

score 5 · Answer 25 · answered Jan 01 '15 at 14:43

If you want to find duplicate data (by one or several criterias) and select the actual rows.

with MYCTE as (
    SELECT DuplicateKey1
        ,DuplicateKey2 --optional
        ,count(*) X
    FROM MyTable
    group by DuplicateKey1, DuplicateKey2
    having count(*) > 1
) 
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
    AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/

score 3 · Answer 26 · edited Jan 10 '19 at 15:06

3

To delete records whose names are duplicate

;WITH CTE AS    
(

    SELECT ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS T FROM     @YourTable    
)

DELETE FROM CTE WHERE T > 1

edited Jan 10 '19 at 15:06

Mightee

689
7
22

answered Jan 10 '19 at 12:46

Sheriff

738
10
20

Does it work? How comes I get this error 'relation "cte" does not exist' in Postgres? – newman Jan 22 '21 at 02:45
CTE works also in postgress sql..Here is the link https://www.postgresqltutorial.com/postgresql-cte/ You must be missing something else. – Sheriff Jan 23 '21 at 15:27

score 3 · Answer 27 · answered Mar 18 '19 at 17:32

To Check From duplicate Record in a table.

select * from users s 
where rowid < any 
(select rowid from users k where s.name = k.name and s.email = k.email);

or

select * from users s 
where rowid not in 
(select max(rowid) from users k where s.name = k.name and s.email = k.email);

To Delete the duplicate record in a table.

delete from users s 
where rowid < any 
(select rowid from users k where s.name = k.name and s.email = k.email);

or

delete from users s 
where rowid not in 
(select max(rowid) from users k where s.name = k.name and s.email = k.email);

The AG · Answer 28 · 2020-11-19T15:56:25.460

3

Another easy way you can try this using analytic function as well:

SELECT * from 

(SELECT name, email,

COUNT(name) OVER (PARTITION BY name, email) cnt 

FROM users)

WHERE cnt >1;

edited Nov 19 '20 at 15:56

answered May 21 '20 at 18:55

The AG

672
9
18

1

you missed a comma after the email :) – Amal Ps Nov 18 '20 at 12:07

rahul kumar · Answer 29 · 2017-12-11T13:15:13.073

2

SELECT column_name,COUNT(*) FROM TABLE_NAME GROUP BY column1, HAVING COUNT(*) > 1;

edited Dec 11 '17 at 13:15

answered Dec 05 '17 at 12:41

rahul kumar

135
6

score 2 · Answer 30 · answered Jun 25 '19 at 16:30

2

You may want to try this

SELECT NAME, EMAIL, COUNT(*)
FROM USERS
GROUP BY 1,2
HAVING COUNT(*) > 1

answered Jun 25 '19 at 16:30

adesh

832
4
14
23

score 2 · Answer 31 · answered Apr 07 '21 at 15:01

2

Please try

SELECT UserID, COUNT(UserID) 
FROM dbo.User
GROUP BY UserID
HAVING COUNT(UserID) > 1

answered Apr 07 '21 at 15:01

Md. Nazmul Nadim

207
2
5

Suraj Kumar · Answer 32 · 2018-10-26T17:18:25.713

We can use having here which work on aggregate functions as shown below

create table #TableB (id_account int, data int, [date] date)
insert into #TableB values (1 ,-50, '10/20/2018'),
(1, 20, '10/09/2018'),
(2 ,-900, '10/01/2018'),
(1 ,20, '09/25/2018'),
(1 ,-100, '08/01/2018')  

SELECT id_account , data, COUNT(*)
FROM #TableB
GROUP BY id_account , data
HAVING COUNT(id_account) > 1

drop table #TableB

Here as two fields id_account and data are used with Count(*). So, it will give all the records which has more than one times same values in both columns.

We some reason mistakely we had missed to add any constraints in SQL server table and the records has been inserted duplicate in all columns with front-end application. Then we can use below query to delete duplicate query from table.

SELECT DISTINCT * INTO #TemNewTable FROM #OriginalTable
TRUNCATE TABLE #OriginalTable
INSERT INTO #OriginalTable SELECT * FROM #TemNewTable
DROP TABLE #TemNewTable

Here we have taken all the distinct records of the orignal table and deleted the records of original table. Again we inserted all the distinct values from new table to the original table and then deleted new table.

score 1 · Answer 33 · edited Oct 18 '20 at 13:17

Table structure:

ID   NAME   EMAIL
1    John   asd@asd.com
2    Sam    asd@asd.com
3    Tom    asd@asd.com
4    Bob    bob@asd.com
5    Tom    asd@asd.com

Solution 1:

SELECT *,
       COUNT(*)
FROM users t1
INNER JOIN users t2
WHERE t1.id > t2.id
  AND t1.name = t2.name
  AND t1.email=t2.email

Solution 2:

SELECT name,
         email,
       COUNT(*)
FROM users
GROUP BY name,
         email
HAVING COUNT(*) > 1

score 0 · Answer 34 · edited Feb 17 '20 at 01:56

The exact code would differ depending on whether you want to find duplicate rows as well or only different ids with the same email and name. If id is a primary key or otherwise has a unique constraint this distinction does not exist, but the question does not specify this. In the former case you can use code given in several other answers:

SELECT name, email, COUNT(*)
FROM users
GROUP BY name, email
HAVING COUNT(*) > 1

In the latter case you would use:

SELECT name, email, COUNT(DISTINCT id)
FROM users
GROUP BY name, email
HAVING COUNT(DISTINCT id) > 1
ORDER BY COUNT(DISTINCT id) DESC

score 0 · Answer 35 · answered Dec 16 '20 at 15:50

In case you work with Microsoft Access, this way works:

CREATE TABLE users (id int, name varchar(10), email varchar(50));

INSERT INTO users VALUES (1, 'John', 'asd@asd.com');
INSERT INTO users VALUES (2, 'Sam', 'asd@asd.com');
INSERT INTO users VALUES (3, 'Tom', 'asd@asd.com');
INSERT INTO users VALUES (4, 'Bob', 'bob@asd.com');
INSERT INTO users VALUES (5, 'Tom', 'asd@asd.com');

SELECT name, email, COUNT(*) AS CountOf
FROM users
GROUP BY name, email
HAVING COUNT(*)>1;

DELETE *
FROM users
WHERE id IN (
    SELECT u1.id 
    FROM users u1, users u2 
    WHERE u1.name = u2.name AND u1.email = u2.email AND u1.id > u2.id
);

Thanks to Tancrede Chazallet for the delete code.

score 0 · Answer 36 · answered Jun 03 '21 at 12:03

0

you use below query that i use :

   select *
        FROM TABLENAME
        WHERE PrimaryCoumnID NOT IN
        (
            SELECT MAX(PrimaryCoumnID)
            FROM  TABLENAME
            GROUP BY AnyCoumnID
        );

answered Jun 03 '21 at 12:03

imomins

24
1
4

score 0 · Answer 37 · answered Nov 20 '22 at 06:22

try this:

                DECLARE @myTable TABLE
                (
                    id INT,
                    name VARCHAR(10),
                    email VARCHAR(50)
                );

                INSERT @myTable
                VALUES
                (1, 'John', 'John-email');
                INSERT @myTable
                VALUES
                (2, 'John', 'John-email');
                INSERT @myTable
                VALUES
                (3, 'fred', 'John-email');
                INSERT @myTable
                VALUES
                (4, 'fred', 'fred-email');
                INSERT @myTable
                VALUES
                (5, 'sam', 'sam-email');
                INSERT @myTable
                VALUES
                (6, 'sam', 'sam-email');


                WITH cte
                AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS rowNum,
                           *
                    FROM @myTable)
                SELECT c1.id,
                       c1.name,
                       c1.email
                FROM cte AS c1
                WHERE 1 <
                (
                    SELECT COUNT(c2.rowNum)
                    FROM cte AS c2
                    WHERE c1.name = c2.name
                          AND c1.email = c2.email
                );

Code · Answer 38 · 2019-04-26T13:57:03.467

-1

How to get duplicate record in table

 SELECT COUNT(EmpCode),EmpCode FROM tbl_Employees WHERE Status=1 
 GROUP BY EmpCode HAVING COUNT(EmpCode) > 1

edited Apr 26 '19 at 13:57

answered Sep 27 '18 at 11:38

Code

679
5
9

score -1 · Answer 39 · answered Apr 04 '19 at 14:21

-1

You can use the SELECT DISTINCT keyword to get rid of duplicates. You can also filter by name and get everyone with that name on a table.

answered Apr 04 '19 at 14:21

Parkofadown

616
7
11

Finding duplicate values in a SQL table

39 Answers39

Creating Sample Data

1. USING GROUP BY CLAUSE

2. Using CTE:

3. Using ROW_NUMBER() function

Linked

Related