23

I have a table that has a lot of duplicates in the Name column. I'd like to only keep one row for each.

The following lists the duplicates, but I don't know how to delete the duplicates and just keep one:

SELECT name FROM members GROUP BY name HAVING COUNT(*) > 1;

Thank you.

S.Lott
  • 384,516
  • 81
  • 508
  • 779
Gulbahar
  • 5,343
  • 20
  • 70
  • 93
  • Also see [Deleting duplicate rows from sqlite database](https://stackoverflow.com/q/8190541/608639). – jww Nov 02 '19 at 02:53

10 Answers10

58

See the following question: Deleting duplicate rows from a table.

The adapted accepted answer from there (which is my answer, so no "theft" here...):

You can do it in a simple way assuming you have a unique ID field: you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their name.

Example query:

DELETE FROM members
WHERE ID NOT IN
(
    SELECT MIN(ID)
    FROM members
    GROUP BY name
)

In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.

Community
  • 1
  • 1
Roee Adler
  • 33,434
  • 32
  • 105
  • 133
  • 1
    Here's how I understand the above: For each name, it groups them (only one if unique; several into one if duplicates), selects the smallest ID from the set, and then deletes any row whose ID doesn't exist in the table. Brilliant :) Thanks much Rax. – Gulbahar Aug 17 '09 at 09:16
  • 2
    in mysql I get the following error when sending this query: `"error 1093 (HY000) but it gives an error 'You cant specify target table 'members' for update in FROM clause"` any ideas? – David LeBauer Feb 21 '11 at 18:07
  • 1
    the problem was that 'members' was both the field and table name. this is what worked: delete from members where id not in (select min(id) from (select * from members) as x group by name) – David LeBauer Feb 21 '11 at 20:30
  • 1
    Thank you. You made my day. – Priya Jagtap Nov 24 '16 at 09:19
  • Can we do DELETE FROM members WHERE ID NOT IN ( SELECT name FROM members GROUP BY name HAVING COUNT(*) > 1; ) – mach2 Jun 06 '18 at 07:05
  • `ROWID` (instead of `name`) can be used to differentiate rows if they are completely identical. – xamgore Jan 09 '22 at 19:02
4

It would probably be easier to select the unique ones into a new table, drop the old table, then rename the temp table to replace it.

#create a table with same schema as members
CREATE TABLE tmp (...);

#insert the unique records
INSERT INTO tmp SELECT * FROM members GROUP BY name;

#swap it in
RENAME TABLE members TO members_old, tmp TO members;

#drop the old one
DROP TABLE members_old;
Paul Dixon
  • 295,876
  • 54
  • 310
  • 348
  • 1
    Thanks Paul. For those interested... CREATE TEMP TABLE tmp_members (name VARCHAR); INSERT INTO tmp_members SELECT name FROM members GROUP BY name; SELECT COUNT(name) FROM tmp_members; DELETE FROM members; VACUUM members; SELECT COUNT(name) FROM members; INSERT INTO members (name) SELECT * FROM tmp_members; SELECT COUNT(name) FROM members; SELECT DISTINCT COUNT(name) FROM members; SELECT name FROM members LIMIT 10; DROP TABLE tmp_members; – Gulbahar Aug 17 '09 at 09:11
  • Sorry, I missed that you were using SQLite! – Paul Dixon Aug 17 '09 at 09:14
0

We have a huge database where deleting duplicates is part of the regular maintenance process. We use DISTINCT to select the unique records then write them into a TEMPORARY TABLE. After TRUNCATE we write back the TEMPORARY data into the TABLE.

That is one way of doing it and works as a STORED PROCEDURE.

G Berdal
  • 1,134
  • 3
  • 14
  • 28
  • 1
    I have to admit Rax Olgud's answer is much-much more sophisticated and probably runs 100 times quicker! :) - I'm thinking about adopting the solution... Deserves +1! – G Berdal Aug 17 '09 at 13:00
0

If we want to see first which rows you are about to delete. Then delete them.

with MYCTE as (
    SELECT DuplicateKey1
        ,DuplicateKey2 --optional
        ,count(*) X
    FROM MyTable
    group by DuplicateKey1, DuplicateKey2
    having count(*) > 1
) 
SELECT E.*
FROM MyTable E
JOIN MYCTE cte
ON E.DuplicateKey1=cte.DuplicateKey1
    AND E.DuplicateKey2=cte.DuplicateKey2
ORDER BY E.DuplicateKey1, E.DuplicateKey2, CreatedAt

Full example at http://developer.azurewebsites.net/2014/09/better-sql-group-by-find-duplicate-data/

Lauri Lubi
  • 529
  • 5
  • 8
0

delete dup row keep one table has duplicate rows and may be some rows have no duplicate rows then it keep one rows if have duplicate or single in a table. table has two column id and name if we have to remove duplicate name from table and keep one. Its Work Fine at My end You have to Use this query.

DELETE FROM tablename
WHERE id NOT IN(

 SELECT id FROM
(
    SELECT MIN(id)AS id
    FROM tablename
    GROUP BY name HAVING 
    COUNT(*) > 1
)AS a )
AND id NOT IN(
(SELECT ids FROM
(
SELECT MIN(id)AS ids
    FROM tablename
    GROUP BY name HAVING 
    COUNT(*) =1
)AS a1
)
)

before delete table is below see the screenshot: enter image description here after delete table is below see the screenshot this query delete amit and akhil duplicate rows and keep one record (amit and akhil):

enter image description here

Akhil Singh
  • 700
  • 6
  • 17
0

You can join table with yourself by matched field and delete unmatching rows

DELETE t1 FROM table_name t1 
LEFT JOIN tablename t2 ON t1.match_field = t2.match_field
WHERE t1.id <> t2.id;
AnyKey
  • 79
  • 2
0

if you want to remove duplicate record from table.

CREATE TABLE tmp SELECT lastname, firstname, sex
FROM user_tbl;
GROUP BY (lastname, firstname);

DROP TABLE user_tbl;

ALTER TABLE tmp RENAME TO user_tbl;
Er Pkumar soni
  • 152
  • 1
  • 7
0

show record

SELECT `page_url`,count(*) FROM wl_meta_tags GROUP BY page_url HAVING count(*) > 1

delete record

DELETE FROM wl_meta_tags 
WHERE meta_id NOT IN( SELECT meta_id 
FROM ( SELECT MIN(meta_id)AS meta_id FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) > 1 )AS a ) 
AND meta_id NOT IN( (SELECT ids FROM (
SELECT MIN(meta_id)AS ids FROM wl_meta_tags GROUP BY page_url HAVING COUNT(*) =1 )AS a1 ) )

Source url

Ibrahim Hammed
  • 833
  • 1
  • 8
  • 17
0

DELETE FROM tablename WHERE ID IN( SELECT MAX(ID) ID FROM tablename GROUP BY IDNumber HAVING COUNT(IDNumber) > 1 )

Pidoski
  • 21
  • 4
-1
WITH CTE AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY [emp_id] ORDER BY [emp_id]) AS Row, * FROM employee_salary
)


DELETE FROM CTE
WHERE ROW <> 1
Jayendran
  • 9,638
  • 8
  • 60
  • 103