0

I have a huge database full of duplicate entry errors. What is the fastest way to remove all of the errors?

Error: 1062 - Duplicate entry 'X' for key 'PRIMARY'

Thanks in advance.

  • Use [this](http://stackoverflow.com/questions/12113699/get-top-n-records-for-each-group-of-grouped-results) to get the duplicate IDs, then a separate query to remove them. – Marc B Aug 07 '13 at 20:41
  • @MarcB: The problem is that my database needs to be imported and when I import it, the duplicates generate an error. –  Aug 07 '13 at 20:49
  • 1
    Then you need to fix the import to send duplicate rows to an exception table and remove them from the data that gets imported to the production table. IMports are complex things and it is a best practicce to never import directly to a production table but to import to a staging table that has relaxed rules and then clean the data before trying to import it to the final destination. YOu need to write a complete import process. – HLGEM Aug 07 '13 at 20:58
  • @Ak1to: then you should say that in the first place. don't tell us you've got a booboo on your finger, then later on "oh by the way my arm was severed" – Marc B Aug 07 '13 at 21:04

1 Answers1

0

if it does no matter what is left in table do like that:

  1. dump datebase
  2. truncate all tables
  3. add PK whatever You need (empty db)
  4. in console replace your dump from insert to insert ignore
  5. recover from dump
  6. profit

i did this once, even not at production/not big DB - 200mb per dump - just some anoing relation issue at localhost. I know it's not pretty... but for me worked.

EDIT: if someone thinks that it's bad solution, please leave an comment - i would like to know how it could be better...

jaczes
  • 1,366
  • 2
  • 8
  • 16