1

I have this table in MySQL:

  • id_word
  • lang_original (the language from the original word) VARCHAR(2)
  • lang_target (the language from the translated word) VARCHAR(2)
  • word (the word itself) VARCHAR(50)
  • translation (the translation) VARCHAR(50)

They should not have duplicates. Is it possible to have a sql query that finds duplicates and deletes them (leaving the first match undeleted)?

Update a duplicate would be something that has the same lang_original,lang_target and word (only those 3 fields).

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Andres SK
  • 10,779
  • 25
  • 90
  • 152
  • possible duplicate of [SQL - How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows) and many, many others. – Andriy M Dec 08 '11 at 08:52

3 Answers3

1

Could work like this:

DELETE FROM tbl
WHERE  EXISTS (
   SELECT *
   FROM   tbl t
   WHERE (t.lang_original,   t.lang_target,   t.word)
       = (tbl.lang_original, tbl.lang_target, tbl.word)
   AND tbl.id_word > t.id_word
   )

If @Jason is right, and MySQL does not allow to reference the delete table, here is another form that works independently:

DELETE FROM tbl
USING (
   SELECT min(id_word) AS min_id, lang_original, lang_target, word
   FROM   tbl t
   GROUP  BY lang_original, lang_target, word
   HAVING count(*) > 1
   ) x
WHERE (tbl.lang_original, tbl.lang_target, tbl.word)
   =  (  x.lang_original,   x.lang_target,   x.word)
AND    tbl.id_word > x.min_id

Both variants leave the duplicate with the smallest id alive and kill the rest.

If you want to save all your translations to the word with the smallest id in a group of dupes first:

UPDATE tbl SET translation = all_trans
FROM  (
   SELECT min(id_word) AS min_id, group_concat(translation) AS all_trans
   FROM   tbl
   GROUP  BY lang_original, lang_target, word
   HAVING count(*) > 1
   ) t
WHERE  tbl.id_word = t.min_id
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
1

It's simpler to create a new table. Previous answers are good, but I like it this way: Create a new table with a unique key for "lang_original, lang_target, word"

CREATE TABLE new_table_can_be_renamed_later ( 
  ..your-fields...
  UNIQUE unique (lang_original,lang_target,workd)
);

Then fill your new table with by selecting the old table and use IGNORE in your INSERT

INSERT IGNORE INTO new_table_can_be_renamed_later 
  SELECT * FROM original_table

Please consider Mysql docs for right syntax.

vik
  • 762
  • 1
  • 7
  • 18
0

I'm not sure that you can do that. You are probably better off doing something like

select distinct * into yournewtable from originaltable

That may work.

Johnny Rocket
  • 1,394
  • 3
  • 17
  • 25