2

I have a table in my MySql server with the following columns: ID (int, key), type (int), name (varchar).

Due to an error in my application, duplicate entries was inserted to the db, i want to delete those entries so from each type & name pair there will be only one row.

any thoughts on how to do this?

Ran
  • 3,455
  • 12
  • 47
  • 60
  • possible duplicate of [What's the best way to dedupe a table?](http://stackoverflow.com/questions/2230295/whats-the-best-way-to-dedupe-a-table) – Lukas Eder Aug 26 '11 at 15:22

4 Answers4

1

That depends on what you want to keep and what you want to remove. Since ID is a key, I'm guessing that there are no duplicate ID's but duplicate type/name pairs. So here's an idea on how to remove them:

delete from my_table t1
where exists (select 1 
                from my_table t2
               where t2.type = t1.type
                 and t2.name = t1.name
                 and t2.id < t1.id)

That will keep the "duplicate" with the lowest ID

                 and t2.id > t1.id

That would keep the "duplicate" with the highest ID

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
  • You've got the same direction on the condition twice, both of which will save the lowest `id`. – Clockwork-Muse Aug 26 '11 at 15:44
  • i've tried running the following: `delete from s_relations t1 where exists (select 1 from s_relations t2 where t2.source_persona_id = t1.source_persona_id and t2.relation_type = t1.relation_type and t2.message_id = t1.message_id and t2.target_object_id = t1.target_object_id and t1.id > t2.id)` but got the following error: `SQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 't1 where exists (select 1 from s_relations t2 ' at line 1` – Ran Aug 26 '11 at 21:32
0

Obviously change this query to a select statement first to ensure the correct records are being selected for deletion:

delete from table as t1
using table as t2
where t1.type = t2.type and t1.name = t2.name and t1.id > t2.id
Derek
  • 21,828
  • 7
  • 53
  • 61
  • Interesting syntax, `DELETE .. USING`. I guess that's MySQL-specific? – Lukas Eder Aug 26 '11 at 15:20
  • i tried running the script: `delete from s_relations as t1 using s_relations as t2 where t2.source_persona_id = t1.source_persona_id and t2.relation_type = t1.relation_type and t2.message_id = t1.message_id and t2.target_object_id = t1.target_object_id and t1.id > t2.id`, but got the following error: `SQL Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'as t1 using s_relations as t2 where t2.source_persona_id = t1.source_persona_i' at line 1` – Ran Aug 26 '11 at 21:07
0

You need to select distinct into a new table, then remove the old table and rename the new table. But there are lots of ways to get this done:

What's the best way to dedupe a table?

Community
  • 1
  • 1
Daniel Pereira
  • 1,785
  • 12
  • 10
0

i ended up using the solution from this post: http://www.justin-cook.com/wp/2006/12/12/remove-duplicate-entries-rows-a-mysql-database-table/

basically, i've create a new table and copied the data from the old table to the new table without the duplications by using group by then i dropped the old table and renamed the new one.

Thanks All.

Ran
  • 3,455
  • 12
  • 47
  • 60