How do I delete duplicate rows and keep the first row?

Question

I made a mistake and I have unwanted duplicates.

I have a table with 4 key fields. A1, k1, k2, k3.

A1 is auto increment and the primary key.

the combination of k1, k2 and k3 is supposed to be unique and I have to delete the duplicate rows before I create a unique index. Some rows have one duplicate, some have many.

SELECT CONCAT(k1, k2, k) AS dup_value
  FROM myviews
 GROUP BY dup_value
HAVING (COUNT(dup_value) > 1)

shows me duplicates values that I need to deal with. But now I don't know how to keep one and delete the rest of each duplicate set.

There are a LOT of questions regarding this problem. This should be closed as duplicate - FAST. Do a search for "delete duplicate rows." Here's [the first result](http://stackoverflow.com/questions/3777633/delete-duplicate-rows-dont-delete-all-duplicate) for me. — rockerest, May 23 '11 at 21:30
@rockerest Interestingly I tried the solution you linked to first and it doesn't work with MySQL. The answer from OMG Ponies did work for me on MySQL so that's where my +1 goes. — jsleuth, Aug 13 '13 at 18:36

score 16 · Accepted Answer · edited Jun 20 '20 at 09:12

Backup your data, then...

MySQL supports JOINs in DELETE statements. If you want to keep the first of the duplicates:

DELETE a
  FROM MYVIEWS a
  JOIN (SELECT MIN(t.a1) AS min_a1, t.k1, t.k2, t.k3
          FROM MYVIEWS t
      GROUP BY t.k1, t.k2, t.k3
        HAVING COUNT(*) > 1) b ON b.k1 = a.k1
                              AND b.k2 = a.k2
                              AND b.k3 = a.k3
                              AND b.min_a1 != a.a1

If you want to keep the last of the duplicates:

DELETE a
  FROM MYVIEWS a
  JOIN (SELECT MAX(t.a1) AS max_a1, t.k1, t.k2, t.k3
          FROM MYVIEWS t
      GROUP BY t.k1, t.k2, t.k3
        HAVING COUNT(*) > 1) b ON b.k1 = a.k1
                              AND b.k2 = a.k2
                              AND b.k3 = a.k3
                              AND b.max_a1 != a.a1

I get an error #1093 - You can't specify target table 'myviews' for update in FROM clause — sdfor, May 23 '11 at 22:06
@sdfor: Sorry, forgot about that limitation -- try the updated code. — OMG Ponies, May 23 '11 at 22:08

score 2 · Answer 2 · answered May 23 '11 at 21:30

2

You can create a new table with the same structure but empty, then create the unique key on it, then do a INSERT IGNORE / SELECT * FROM the original table into the new table, then delete the original table.

INSERT IGNORE will automatically ignore any primary or unique key issues and just skip the duplicates.

answered May 23 '11 at 21:30

spanky

1,479
2
11
22

Not an ideal practice for Production systems -- queries will be depending on the old table, and there can't be more than one table in the database with the same name. This would require an outage. – OMG Ponies May 23 '11 at 21:45
Very true. I assumed OP's wasn't in production yet, due to altering keys and deleting tons of rows, but I shouldn't assume! – spanky May 23 '11 at 21:59

score 2 · Answer 3 · answered May 23 '11 at 21:34

2

Someting like this?

DELETE FROM myviews WHERE EXISTS(SELECT CONCAT(k1, k2, k) AS dup_value
FROM myviews
GROUP BY dup_value
HAVING (COUNT(dup_value) > 1));

answered May 23 '11 at 21:34

Automatico

12,420
9
82
110

Or maybe I misunderstood your question. Is it duplicity within one tuple? – Automatico May 23 '11 at 21:35
this is wrong (at least relative to the question) becouse it deletes all rows on duplicated sets – Saic Siquot Aug 26 '11 at 13:39
In addition to deleting all rows instead of all but one (as @LuisSiquot mentions) this will cause an error when run on tables: SQL Error (1093): You can't specify target table 'mytable' for update in FROM clause – ebyrob Nov 10 '16 at 16:52

score 0 · Answer 4 · answered May 23 '11 at 21:34

0

You need a separator in your concat function, because otherwise "a", "b", and "cd" is the same as "abcd", "", "".

answered May 23 '11 at 21:34

Nick ODell

15,465
3
32
66

How do I delete duplicate rows and keep the first row?

4 Answers4

Backup your data, then...

Linked