3

How can I delete duplicate records and randomly retain one record in MYSQL?

NightShadeQueen
  • 3,284
  • 3
  • 24
  • 37
user1761173
  • 59
  • 1
  • 4
  • This is actually a duplicate of http://stackoverflow.com/questions/3311903/remove-duplicate-rows-in-mysql – Basilevs Sep 16 '14 at 16:31

1 Answers1

6

Use ALTER TABLE to add a unique index, and specify the IGNORE option so it doesn't get errors:

ALTER IGNORE TABLE mytable 
ADD UNIQUE INDEX (products_id, categories_id);

UPDATE:

To get the choice to be random, create a new table with the same schema as the old one, but with a unique key on these columns. Then do:

INSERT IGNORE INTO newtable
SELECT * from oldtable
ORDER BY RANDOM();

The first of each duplicate in the random ordering will be inserted into the new table, the rest will be ignored because of the IGNORE option.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • This deletes any duplicate records already inserted in the table? +1 for it! It's great! – Barranka Oct 20 '12 at 08:19
  • See also http://stackoverflow.com/questions/3767837/delete-duplicate-record-from-same-table-in-mysql for another way, but it assumes there's another column that's guaranteed to be different among all the duplicates. – Barmar Oct 20 '12 at 08:21
  • how does alter ignore retain a record? I want it to randomly select one record – user1761173 Oct 20 '12 at 08:25
  • It retains the first record of each duplicate set. See the [documentation](http://dev.mysql.com/doc/refman/5.0/en/alter-table.html). – Barmar Oct 20 '12 at 08:37
  • Are there anyways to randomly keep one record of the duplicates? – user1761173 Oct 20 '12 at 08:49
  • @user1761173 What's the point? if they're identical, what difference does it make if it's the first, the last, or any of them? – Barranka Oct 20 '12 at 15:48
  • @Barranka Those two columns being identical doesn't mean the rows are identical. – Michael Mior Oct 22 '12 at 00:51
  • @MichaelMior If it doesn't matter which row is kept, i think there's no point forcing a random choice ;-) – Barranka Oct 22 '12 at 04:58
  • only one of the column is identical. I want the other value of the column to be randomly selected – user1761173 Oct 22 '12 at 09:01