How to identify and delete or update duplicate rows in a MySQL table

Question

Due to a recent bug, we have a table with multiple duplicate entries.
What I want to do is to find and ideally delete (or perhaps just update) the duplicate rows.

PersonGroup
-----------
id
personId
groupId
type
primary
value

select count(*) cnt from personGroup pg where type="FOO" group by personId having cnt > 1;

yields nearly 20k rows. There should be 0. Each personId should have only one entry for any given type.

I can write a program to fix this scenario but before I do that I'm wondering if there is a purely SQL solution.

https://stackoverflow.com/questions/3311903/remove-duplicate-rows-in-mysql – Shawn Dec 12 '18 at 22:27
This is absolutely NOT a duplicate of that other Stack Overflow question. That question should be titled "AVOID duplicate rows in MySQL"; this is a question about how to remove them once we have them. – kasdega Dec 13 '18 at 02:14
@Nick please read this question and the other referenced question more closely; this is not a duplicate. The accepted answer in the other question refers to setting a unique index, which won't help here. – kasdega Dec 13 '18 at 02:15

score 1 · Accepted Answer · answered Dec 12 '18 at 22:09

Check this query. I think it is pretty simple and yet effective:

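-- Keep the row with the highest id per personId and delete the rest.
-- The derived table (kept_rows) avoids MySQL error 1093, which is raised
-- when a DELETE subquery reads directly from the table being deleted from.
delete from personGroup
 where id not in (
    select lastId
      from (select max(id) as lastId
              from personGroup
             group by personId) kept_rows);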

If your table is large, consider writing this with an inner join instead:

 -- Join each row to the highest id for its personId, then delete the older rows.
 delete personGroup
   from personGroup
  inner join (
     select max(id) as lastId, personId
       from personGroup
      group by personId
     having count(*) > 1) dup on dup.personId = personGroup.personId
  where personGroup.id < dup.lastId;

The above query is not tested.
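
Since the delete is untested, it may be worth previewing the affected rows first. A minimal sketch, assuming the table and column names from the question and that the highest id per personId is the row to keep: it is the same join as the delete, with a select in place of the delete, and the alias pg is only for brevity.

```sql
-- Preview: rows that the join-based delete would remove.
-- Nothing is modified; every returned row has a newer sibling (higher id)
-- for the same personId.
select pg.*
  from personGroup pg
 inner join (
    select max(id) as lastId, personId
      from personGroup
     group by personId
    having count(*) > 1) dup on dup.personId = pg.personId
 where pg.id < dup.lastId
 order by pg.personId, pg.id;
```

If the row count looks wrong, for example because rows of different types are being treated as duplicates of each other, adjust the grouping before running the delete.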

Derviş Kayımbaşıoğlu


Thank you. I can test this for sure...I want to keep the most recent entry so reading through this I think that's what this will do...I'll test. – kasdega Dec 12 '18 at 22:11
yes exactly, it keeps most recent record by id – Derviş Kayımbaşıoğlu Dec 12 '18 at 22:17
This worked exactly as I needed it to...I had to add a group by `type` but that's because my original question didn't state that requirement. Thank you! – kasdega Dec 13 '18 at 02:12
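
The last comment mentions adding a group by on `type`. The exact query kasdega ran is not shown, so the following is only an untested sketch of how that adjustment might look: grouping and joining on both personId and type, so that each (personId, type) pair keeps only its highest id.

```sql
-- Hypothetical type-aware variant of the join-based delete.
-- Rows are duplicates only if they share both personId and type.
delete pg
  from personGroup pg
 inner join (
    select max(id) as lastId, personId, type
      from personGroup
     group by personId, type
    having count(*) > 1) dup
    on dup.personId = pg.personId
   and dup.type = pg.type
 where pg.id < dup.lastId;
```

Without the extra `type` condition, a person with one FOO row and one row of another type would lose the older of the two, even though neither is a duplicate under the rule stated in the question.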
