How to DELETE duplicate records in sql

Question

Possible Duplicate:
How to delete duplicate rows with SQL?

I have a table with records and I want to delete all duplicate records

DELETE FROM 'table'
WHERE 'field' IN 
(
SELECT 'field' FROM 'table' GROUP BY 'field'
HAVING (COUNT('field')>1)
)

Why isn't this working?

Did you get an error message or did the query run successfully? — cms_mgr, Jan 31 '13 at 14:56
wouldnt your query delete all the fields including duplicates? if you have 2 duplicates with that ID your IN() query will delete both. Dont you wanna keep at least one record? — GGio, Jan 31 '13 at 14:58
mysql doesn't let you update/delete on a table from which you're selecting data. — Marc B, Jan 31 '13 at 14:59
Table names must not be quoted with single quotes. `'table_name'` is a string literal. `table_name` is the name of a table. — , Jan 31 '13 at 15:03
can you add sample records? do you have auto_incremented columns? — John Woo, Jan 31 '13 at 15:05
thank a_horse_with_no_name, but now I get another error... #1093 - You can't specify target table 'table' for update in FROM clause — vdhmartijn, Jan 31 '13 at 15:07
Once you're done (with whatever answer works) make sure you add a `UNIQUE` constraint so you don't have to do this again in the future. — Damien_The_Unbeliever, Jan 31 '13 at 15:22

score 6 · Accepted Answer · answered Jan 31 '13 at 15:10

6

Maybe you can explore with the command DISTINCT to select only unique records based in a field.

You can create a new table with the unique entries based. As an example...

CREATE TABLE nonDuplicates  
SELECT DISTINCT * FROM yourTable group by field

answered Jan 31 '13 at 15:10

peixe

1,272
3
14
31

1

This will only work because MySQL's handling of GROUP BY is basically incorrect and allows for this invalid syntax. But it is most probably the fastest solution. – Jan 31 '13 at 15:13
1

@vdhmartijn: please read this http://rpbouman.blogspot.de/2007/05/debunking-group-by-myths.html in order to understand *why* it's working in MySQL and why other DBMS would reject the statement as invalid. – Jan 31 '13 at 15:27

score 1 · Answer 2 · edited Jan 31 '13 at 15:04

1

This gives you more than one result:-

SELECT field FROM `table`
GROUP BY field
HAVING (COUNT('field)>1

Try to chenge this by:

SELECT TOP 1 field
FROM `table`
GROUP BY field 
HAVING (COUNT(field)>1

edited Jan 31 '13 at 15:04

answered Jan 31 '13 at 14:56

Andrew Adamich

707
4
8

SELECT TOP 1 doesn't work for me.. still not working – vdhmartijn Jan 31 '13 at 15:04
How does this delete anything? – Jan 31 '13 at 15:10

score 0 · Answer 3 · answered Jan 31 '13 at 15:09

0

First get all duplicates and store them in array.

SELECT field 
FROM `table`
GROUP BY field
HAVING COUNT('field') > 1

now do your php to save one result whichever you want and then execute

DELETE 
FROM `table`
WHERE field IN (your values) AND field != savedID

answered Jan 31 '13 at 15:09

GGio

7,563
11
44
81

Thanks for you reaction, but my intention is to work without PHP only SQL – vdhmartijn Jan 31 '13 at 15:11

score 0 · Answer 4 · 2013-01-31T15:21:31.730

MySQL has a very obnoxious restriction that cannot use the table that is being updated/deleted/inserted in a sub-select.

But you can work around this by joining the table to be deleted (instead of using a sub-select).

Assuming you have some kind of unique identifier in your table (I assume a column id in the following statement):

DELETE d 
FROM table_with_duplicates d 
JOIN ( 
   SELECT min(id) as min_id, field
   FROM table_with_duplicates 
   GROUP BY field
) keep ON keep.field = d.field
      AND keep.min_id <> d.id;

This will keep one row for each of the duplicates (the one with the lowes value in the id column).

If you want to delete all duplicate rows (not keeping at least one), simply remove the AND keep.min_id <> d.id condition.

Edit

If you don't have a unique column, but want to remove all duplicates (not keeping at least one row), then you can use:

DELETE d 
FROM table_with_duplicates d 
JOIN ( 
   SELECT field
   FROM table_with_duplicates 
   GROUP BY field
   HAVING count(*) > 1
) del ON del.field = d.field;

And if I do not have an ID is there still a way to solve this? — vdhmartijn, Jan 31 '13 at 15:18
@vdhmartijn: If you want to delete **all** duplicates (not keeping at least one), then you can do without the ID. Otherwise see peixe's answer. — , Jan 31 '13 at 15:19

How to DELETE duplicate records in sql

4 Answers4