309

What's the simplest SQL statement that will return the duplicate values for a given column and the count of their occurrences in an Oracle database table?

For example: I have a JOBS table with the column JOB_NUMBER. How can I find out if I have any duplicate JOB_NUMBERs, and how many times they're duplicated?

Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
Andrew
  • 12,991
  • 15
  • 55
  • 85
  • 1
    other solutions http://stackoverflow.com/questions/4522431/find-duplicate-entries-in-a-column/4522474#4522474 – zloctb Aug 19 '15 at 05:56

13 Answers13

684

Aggregate the column by COUNT, then use a HAVING clause to find values that appear more than once.

SELECT column_name, COUNT(column_name)
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) > 1;
Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
  • 1
    Thanks - that's the answer I just found and you beat me to posting it back here! :o) – Andrew Sep 12 '08 at 15:19
  • 3
    You're welcome. Now I'm about to post my own question on the differences between count(column) and count(*). :) – Bill the Lizard Sep 12 '08 at 15:23
  • 52
    +1 over 4 years later, still works well, and can be adjusted for selecting multiple columns as long as those are also in the `group by`, as in: `select column_one, column_two, count(*) from tablename group by column_one, column_two having count(column_one) > 1;` etc. – Amos M. Carpenter Sep 24 '12 at 02:19
  • 5
    or even `having count(*) > 1` :D – Stanislav Mamontov Mar 23 '15 at 08:43
  • 4
    +1 over 8 year later, still works well for both latest versions of Oracle and MySQL (remove space after count function in having line). – PhatHV Apr 19 '16 at 02:34
64

Another way:

SELECT *
FROM TABLE A
WHERE EXISTS (
  SELECT 1 FROM TABLE
  WHERE COLUMN_NAME = A.COLUMN_NAME
  AND ROWID < A.ROWID
)

Works fine (quick enough) when there is index on column_name. And it's better way to delete or update duplicate rows.

Bill the Lizard
  • 398,270
  • 210
  • 566
  • 880
Grrey
  • 671
  • 1
  • 6
  • 4
  • 3
    +1 works well for multi-column duplicates (e.g. when you want to add a UNIQUE constraint on several columns), I found this approach less "rigid" than the GROUP BY one to list the duplicate field values + other fields if necessary. – Maxime Pacary Jan 27 '12 at 15:05
  • 3
    Just to clarify, (this wasn't obvious to me at first) this query returns only the duplicates, it does not return the first original entry, which is why it works well for deleting the duplicates, based on a unique constraint across more than 1 column. You can select the duplicate IDs with this query, and then use those to delete the duplicates. – matthewb Nov 21 '12 at 18:26
  • 1
    if you change < to != you will get all the records that duplicate. not just the 2nd or 3rd record – moore1emu Jan 04 '19 at 19:51
36

Simplest I can think of:

select job_number, count(*)
from jobs
group by job_number
having count(*) > 1;
JosephStyons
  • 57,317
  • 63
  • 160
  • 234
17

You don't need to even have the count in the returned columns if you don't need to know the actual number of duplicates. e.g.

SELECT column_name
FROM table
GROUP BY column_name
HAVING COUNT(*) > 1
Evan
  • 18,183
  • 8
  • 41
  • 48
7

How about:

SELECT <column>, count(*)
FROM <table>
GROUP BY <column> HAVING COUNT(*) > 1;

To answer the example above, it would look like:

SELECT job_number, count(*)
FROM jobs
GROUP BY job_number HAVING COUNT(*) > 1;
Andrew
  • 12,991
  • 15
  • 55
  • 85
6

In case where multiple columns identify unique row (e.g relations table ) there you can use following

Use row id e.g. emp_dept(empid, deptid, startdate, enddate) suppose empid and deptid are unique and identify row in that case

select oed.empid, count(oed.empid) 
from emp_dept oed 
where exists ( select * 
               from  emp_dept ied 
                where oed.rowid <> ied.rowid and 
                       ied.empid = oed.empid and 
                      ied.deptid = oed.deptid )  
        group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);

and if such table has primary key then use primary key instead of rowid, e.g id is pk then

select oed.empid, count(oed.empid) 
from emp_dept oed 
where exists ( select * 
               from  emp_dept ied 
                where oed.id <> ied.id and 
                       ied.empid = oed.empid and 
                      ied.deptid = oed.deptid )  
        group by oed.empid having count(oed.empid) > 1 order by count(oed.empid);
Jitendra Vispute
  • 709
  • 8
  • 18
5

I usually use Oracle Analytic function ROW_NUMBER().

Say you want to check the duplicates you have regarding a unique index or primary key built on columns (c1, c2, c3). Then you will go this way, bringing up ROWID s of rows where the number of lines brought by ROW_NUMBER() is >1:

Select *
From Table_With_Duplicates
Where Rowid In (Select Rowid
                  From (Select ROW_NUMBER() Over (
                                 Partition By c1, c2, c3
                                 Order By c1, c2, c3
                               ) nbLines
                          From Table_With_Duplicates) t2
                 Where nbLines > 1)
MT0
  • 143,790
  • 11
  • 59
  • 117
J. Chomel
  • 8,193
  • 15
  • 41
  • 69
4
SELECT   SocialSecurity_Number, Count(*) no_of_rows
FROM     SocialSecurity 
GROUP BY SocialSecurity_Number
HAVING   Count(*) > 1
Order by Count(*) desc 
Simon Adcock
  • 3,554
  • 3
  • 25
  • 41
4

Doing

select count(j1.job_number), j1.job_number, j1.id, j2.id
from   jobs j1 join jobs j2 on (j1.job_numer = j2.job_number)
where  j1.id != j2.id
group by j1.job_number

will give you the duplicated rows' ids.

agnul
  • 12,608
  • 14
  • 63
  • 85
3

I know its an old thread but this may help some one.

If you need to print other columns of the table while checking for duplicate use below:

select * from table where column_name in
(select ing.column_name from table ing group by ing.column_name having count(*) > 1)
order by column_name desc;

also can add some additional filters in the where clause if needed.

Parth Kansara
  • 189
  • 1
  • 11
1

Here is an SQL request to do that:

select column_name, count(1)
from table
group by column_name
having count (column_name) > 1;
typedef
  • 1,159
  • 1
  • 6
  • 11
0

1. solution

select * from emp
    where rowid not in
    (select max(rowid) from emp group by empno);
DoOrDie
  • 315
  • 3
  • 12
-1

Also u can try something like this to list all duplicate values in a table say reqitem

SELECT count(poid) 
FROM poitem 
WHERE poid = 50 
AND rownum < any (SELECT count(*)  FROM poitem WHERE poid = 50) 
GROUP BY poid 
MINUS
SELECT count(poid) 
FROM poitem 
WHERE poid in (50)
GROUP BY poid 
HAVING count(poid) > 1;
Yaron Idan
  • 6,207
  • 5
  • 44
  • 66
Stacker
  • 49
  • 1
  • 4