SQL to see all duplicates?

Question

I have a massive table with thousands of rows. Each row has a unique path. For example:

electronics/samsung/tv/lcd-23384
electronics/philips/tv/lcd-12ger
etc...

The problem is, a recent query was executed incorrectly, resulting in some of the fields having duplicate paths. So, what i want to know is, is there a query I can execute to show ALL the duplicates? In other words, if I have this:

ID     | PATH 
1      | path_1
2      | path_2
3      | path_3
4      | path_3
5      | path_3
6      | path_4
7      | path_4
8      | path_5
9      | path_6

I want this result:

ID     | PATH 
3      | path_3
4      | path_3
5      | path_3
6      | path_4
7      | path_4

Notice that all NON-duplicates have been removed. Ie.... these ones:

ID     | PATH 
1      | path_1
2      | path_2
8      | path_5
9      | path_6

What SQL query can accomplish this?

see this::http://stackoverflow.com/questions/688549/finding-duplicate-values-in-mysql — Sudhir Bastakoti, Jul 06 '12 at 12:31
@Sudhir: That doesn't do exactly what the OP wants, as one then needs to reexpand the groups. — eggyal, Jul 06 '12 at 12:35

eggyal · Accepted Answer · 2012-07-06T12:42:36.660

4

SELECT * FROM mytable NATURAL JOIN (
  SELECT PATH FROM mytable GROUP BY PATH HAVING COUNT(*) > 1
) dupes

See it on sqlfiddle.

To perform the update you request in the comments below:

UPDATE mytable NATURAL JOIN (
  SELECT PATH FROM mytable GROUP BY PATH HAVING COUNT(*) > 1
) dupes, (SELECT @r:=0) init
SET mytable.PATH = CONCAT(PATH, '-', @r:=@r+1);

See it on sqlfiddle.

edited Jul 06 '12 at 12:42

answered Jul 06 '12 at 12:32

eggyal

122,705
18
212
237

Awesome... is there a way to update all duplicates so the paths end in "-1", "-2", "-3" etc... ? – rockstardev Jul 06 '12 at 12:39

score 0 · Answer 2 · answered Jul 06 '12 at 12:34

0

Try this, substitute needed table and column values:

SELECT YourColumn, COUNT(*) TotalCount
  FROM YourTable
 GROUP BY YourColumn
HAVING COUNT(*) > 1
 ORDER BY COUNT(*) DESC

answered Jul 06 '12 at 12:34

Yaroslav

6,476
10
48
89

score 0 · Answer 3 · answered Jul 06 '12 at 12:35

0

You can see all duplicates by using such query

SELECT * FROM YourTable WHERE Path in
  (SELECT PATH
   FROM YourTable
   GROUP BY PATH
   HAVING COUNT(*)>1)

What kind of rows do you want to remove?

answered Jul 06 '12 at 12:35

Sir Hally

2,318
3
31
48

score 0 · Answer 4 · answered Jul 06 '12 at 12:41

It will give you all duplicate results

select column_name, count(column_name) occurrence from table_name 
group by column_name having count(column_name) > 1;

And if you want to get all non duplicate results..

select column_name, count(column_name) occurrence from table_name 
group by column_name having count(column_name) = 1;

SQL to see all duplicates?

4 Answers4