Deleting duplicate rows from a table

Question

I have a table in my database with duplicate records that I want to delete. I don't want to create a new table with distinct entries; I want to delete the duplicates from the existing table. Is there any way to do this?

 id           action
 L1_name      L1_data
 L2_name      L2_data
 L3_name      L3_data   
 L4_name      L4_data
 L5_name      L5_data
 L6_name      L6_data
 L7_name      L7_data
 L8_name      L8_data
 L9_name      L9_data
 L10_name     L10_data
 L11_name     L11_data
 L12_name     L12_data
 L13_name     L13_data 
 L14_name     L14_data
 L15_name     L15_data

These are all my fields:
id is unique for every row.
L11_data is meant to be unique within its respective action value.
L11_data holds company names, while action holds the names of the industries.

So in my data I have duplicate company names in L11_data for their respective industries.

What I want is a unique name, along with the other data, for each company in the particular industry stored in action. I hope I have stated my problem clearly enough to be understood.

If you want a code answer, you'll need to give the schema of the table that has duplicate data. Also, you should leave the SQL tag on the question to get more views and raise the likelihood the question will be answered satisfactorily. — Welbog, Jun 25 '09 at 11:57

Roee Adler · Accepted Answer · 2009-06-25T15:44:55.193

19

Yes, assuming you have a unique ID field, you can delete all records that are the same except for the ID, but don't have "the minimum ID" for their group of values.

Example query:

DELETE FROM Table
WHERE ID NOT IN
(
SELECT MIN(ID)
FROM Table
GROUP BY Field1, Field2, Field3, ...
)

Notes:

I freely chose "Table" and "ID" as representative names.
The list of fields ("Field1, Field2, ...") should include all fields except for the ID.
This may be a slow query, depending on the number of fields and rows, but I expect it to be acceptable compared to the alternatives.

EDIT: In case you don't have a unique index, my recommendation is to simply add an auto-incremental unique index. Mainly because it's good design, but also because it will allow you to run the query above.
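
To try the query above without touching a real database, here is a minimal sketch using Python's built-in sqlite3 module. The table name "companies", the sample rows, and the choice to group only by action and L11_data (the two columns the question uses to define a duplicate) are assumptions made for illustration; a real table would list every non-ID column that defines a duplicate.

```python
import sqlite3

# Hypothetical miniature of the asker's table: only id, action and L11_data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (id INTEGER PRIMARY KEY, action TEXT, L11_data TEXT)"
)
conn.executemany(
    "INSERT INTO companies (id, action, L11_data) VALUES (?, ?, ?)",
    [
        (1, "Banking", "Acme"),
        (2, "Banking", "Acme"),    # duplicate of id 1
        (3, "Banking", "Globex"),
        (4, "Retail", "Acme"),     # same company name, different industry
        (5, "Retail", "Acme"),     # duplicate of id 4
    ],
)

# Keep the lowest id in each (action, L11_data) group and delete the rest.
conn.execute(
    """
    DELETE FROM companies
    WHERE id NOT IN (
        SELECT MIN(id)
        FROM companies
        GROUP BY action, L11_data
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT id, action, L11_data FROM companies ORDER BY id"
).fetchall()
print(remaining)
```

Rows 2 and 5 are removed; "Acme" survives once per industry because the grouping includes the action column.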

edited Jun 25 '09 at 15:44

answered Jun 25 '09 at 11:52

Roee Adler


IDs are usually numeric, so it should not be a problem. In fact, it will work as long as "MIN" is defined on the ID type. If it's defined on strings, and the field is unique, it will work fine. – Roee Adler Jun 25 '09 at 15:03
I like your solution.. just wanted to clarify... it will be a problem if the table doesn't have a unique index too, it's good to have multiple options for a problem .. – Svetlozar Angelov Jun 25 '09 at 15:08
MySQL does not let you `UPDATE`, `INSERT` into, or `DELETE` rows from a table while referencing the same table in an inner query. See http://stackoverflow.com/questions/4429319/you-cant-specify-target-table-for-update-in-from-clause – Adam Joseph Looze Feb 27 '16 at 05:01
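
A commonly cited workaround for that MySQL restriction is to wrap the inner SELECT in a derived table, so the target table is not referenced directly in the subquery. The sketch below only shows the shape of such a query. It runs on SQLite through Python's sqlite3 module, and SQLite does not have the restriction, so this confirms the query is valid and deletes the right rows but does not reproduce the MySQL error. The table "t", its columns, and the alias "kept_ids" are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO t (field1, field2) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x")],  # ids 1..4
)

# The grouped SELECT is wrapped in a derived table (kept_ids), so the
# DELETE's subquery reads from the derived table rather than from t itself.
cursor = conn.execute(
    """
    DELETE FROM t
    WHERE id NOT IN (
        SELECT min_id
        FROM (
            SELECT MIN(id) AS min_id
            FROM t
            GROUP BY field1, field2
        ) AS kept_ids
    )
    """
)
deleted = cursor.rowcount
conn.commit()

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
print(deleted, remaining_ids)
```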

score 4 · Answer 2 · answered Jun 25 '09 at 11:55


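ALTER IGNORE TABLE `table` ADD UNIQUE INDEX (your cols);

With IGNORE, MySQL keeps the first row for each duplicate key and drops the other rows while building the index, so no separate delete is needed. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.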
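
SQLite has no ALTER IGNORE, so the following is only a sketch of the related idea rather than of this exact statement: once the duplicates are gone, a unique index on the columns that define a duplicate keeps new ones out. It uses Python's sqlite3 module; the table "companies", the index name "uq_action_company", and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (id INTEGER PRIMARY KEY, action TEXT, L11_data TEXT)"
)
conn.executemany(
    "INSERT INTO companies (action, L11_data) VALUES (?, ?)",
    [("Banking", "Acme"), ("Retail", "Acme")],  # already free of duplicates
)

# The index can only be created when no duplicates remain in the table.
conn.execute(
    "CREATE UNIQUE INDEX uq_action_company ON companies (action, L11_data)"
)

# A plain INSERT of an existing (action, L11_data) pair is now rejected.
try:
    conn.execute(
        "INSERT INTO companies (action, L11_data) VALUES (?, ?)",
        ("Banking", "Acme"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# INSERT OR IGNORE skips the conflicting row instead of raising an error.
conn.execute(
    "INSERT OR IGNORE INTO companies (action, L11_data) VALUES (?, ?)",
    ("Banking", "Acme"),
)
row_count = conn.execute("SELECT COUNT(*) FROM companies").fetchone()[0]
print(rejected, row_count)
```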

answered Jun 25 '09 at 11:55

Svetlozar Angelov


score 0 · Answer 3 · edited Feb 05 '13 at 19:02

DELETE
FROM table_x a
WHERE rowid < ANY (
  SELECT rowid
  FROM table_x b
  WHERE a.someField = b.someField
   AND a.someOtherField = b.someOtherField
  )
AND (
  a.someField,
  a.someOtherField
  ) IN (
  SELECT c.someField,
   c.someOtherField
  FROM table_x c
  GROUP BY c.someField,
   c.someOtherField
  HAVING count(*) > 1
  )

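In the query above, the combination of someField and someOtherField must uniquely identify each group of duplicates. Because a row is deleted whenever a matching row with a higher rowid exists, the row with the highest rowid in each group is the one that is kept.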
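
The same idea can be tried on SQLite, which also exposes a rowid but does not support the "< ANY" comparison. The sketch below swaps in an EXISTS test with the same effect: a row is deleted when another row with the same field values and a higher rowid exists. It uses Python's sqlite3 module, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_x (someField TEXT, someOtherField TEXT)")
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?)",
    [("a", "1"), ("a", "1"), ("b", "2"), ("a", "1"), ("b", "3")],  # rowids 1..5
)

# List the duplicated combinations first, as the HAVING clause above does.
dupes_before = conn.execute(
    """
    SELECT someField, someOtherField, COUNT(*)
    FROM table_x
    GROUP BY someField, someOtherField
    HAVING COUNT(*) > 1
    """
).fetchall()

# Delete every row that has a "twin" with a higher rowid, so the row with
# the highest rowid in each group is the one that survives.
conn.execute(
    """
    DELETE FROM table_x
    WHERE EXISTS (
        SELECT 1
        FROM table_x AS b
        WHERE b.someField = table_x.someField
          AND b.someOtherField = table_x.someOtherField
          AND b.rowid > table_x.rowid
    )
    """
)
conn.commit()

survivors = conn.execute(
    "SELECT rowid, someField, someOtherField FROM table_x ORDER BY rowid"
).fetchall()
print(dupes_before, survivors)
```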
