1

This table only has 2 columns. There are rows considered "duplicate" when both of the row's columns match.

That is:

col1 col2
X    X
X    X
X    Y  -- this is not a duplicate

I want to delete the duplicates but leave at least one of them. It doesn't matter which because they are the same thing.

I've used variations of IN and JOIN, but I can't seem to get the DELETE outer query to only delete one of each row with duplicate columns.

CMCDragonkai
  • 6,222
  • 12
  • 56
  • 98

2 Answers2

1

Try this:

DELETE  a
FROM    mytable a
        LEFT JOIN
        (
            SELECT MIN(ID) ID, col1, col2
            FROM    mytable
            GROUP   BY col1, col2
        ) b ON  a.ID = b.ID AND
                a.col1 = b.col1 AND
                a.col2 = b.col2
WHERE   b.ID IS NULL

DEMO

Assuming ID as primary key column

EDIT:

However if you dont have the ID column then you can try like this:

ALTER IGNORE TABLE mytable
  ADD UNIQUE INDEX all_columns_uq
    (col1, col2) ;

DEMO

Rahul Tripathi
  • 168,305
  • 31
  • 280
  • 331
1
ALTER IGNORE TABLE table1 ADD UNIQUE INDEX idx_name (col1,col2);

OR

CREATE TABLE table1_temp AS
SELECT * FROM table1 GROUP BY col1, col2;

TRUNCATE TABLE table1;
INSERT INTO table1 SELECT * FROM table1_temp;

DROP TABLE table1_temp;

You may lose data with the second method on a live table though, also any other tables referencing it may not be too happy!

I'd suggest adding the unique index too, just to future proof yourself.

Arth
  • 12,789
  • 5
  • 37
  • 69
  • @strawberry please don't edit my answer to remove large chunks of text and make your own preferred/superficial changes.. add a comment if you want to suggest changes. – Arth Feb 05 '16 at 13:42