Removing Duplicate Entries in MySQL Database without a primary Key

Question

I've received a reasonably big MySQL database with a lot of duplicates. Around a third of the database are duplicates. It is also missing a primary key.

The structure of the database is so:

unique_id | field01 | field02 | field03 | field04 | ...... | field26 | field27 |

Now while the unique_id is supposed to be, well, unique; there are a lot of duplicates.

Also, for a set of rows belonging to copies of a particular unique_id, the remaining columns may or may not be the same. For example, looking at unique_id 'id_1' below, field01 is same in both the instances, but field02 & field03 are not.

I would like to remove the repetitions and have only one copy of each unique_id survive. It does not matter which one survives.

Example:

id_1 | abc  | dfd  | NULL | ... | def | 
id_2 | abc  | daf  | ghi  | ... | 12a |
id_1 | abc  | xyz  | jkl  | ... | def |
id_4 | aaa  | bbb  | NULL | ... | def |
id_3 | NULL | bbb  | NULL | ... | 123 |
id_5 | 1e3  | NULL | NULL | ... | def |
id_3 | aaa  | bbb  | fds  | ... | def |
id_9 | awa  | bbb  | NULL | ... | 910 |

Should become:

id_1 | abc  | dfd  | NULL | ... | def | 
id_2 | abc  | daf  | ghi  | ... | 12a |
id_4 | aaa  | bbb  | NULL | ... | def |
id_3 | NULL | bbb  | NULL | ... | 123 |
id_5 | 1e3  | NULL | NULL | ... | def |
id_9 | awa  | bbb  | NULL | ... | 910 |

Or this is also fine:

id_2 | abc  | daf  | ghi  | ... | 12a |
id_1 | abc  | xyz  | jkl  | ... | def |
id_4 | aaa  | bbb  | NULL | ... | def |
id_5 | 1e3  | NULL | NULL | ... | def |
id_3 | aaa  | bbb  | fds  | ... | def |
id_9 | awa  | bbb  | NULL | ... | 910 |

Once this is done, I would need to set unique_is as the primary key.

Please advice the best and accurate way to do this. Thanks in advance.

Please refer to this thread you will get some insights about deleting duplicate values :: http://stackoverflow.com/questions/2728413/equivalent-of-oracles-rowid-in-mysql — Ajith Sasidharan, Dec 10 '12 at 08:58

fthiella · Answer 1 · 2012-12-10T09:04:48.997

You could add a unique index in your table, ignoring all errors, and let MySql delete all duplicated rows (except one) for you:

ALTER IGNORE TABLE your_table ADD UNIQUE INDEX dupidx (unique_id)

but if you still get an error, even if you use IGNORE clause, try this:

ALTER TABLE your_table ENGINE MyISAM;
ALTER TABLE IGNORE your_table ADD UNIQUE INDEX dupidx (unique_id);
ALTER TABLE your_table ENGINE InnoDB;

If you are just interested in showing your data, keeping just one row for each duplicated id, you could use this:

SELECT *
FROM your_table
GROUP BY unique_id

score 1 · Answer 2 · answered Oct 30 '14 at 15:18

1

I had that problem, and this worked great for me:

CREATE TABLE newtable SELECT * FROM oldtable GROUP BY unique_id;

Then drop the oldtable, and rename the new table to the old table name

answered Oct 30 '14 at 15:18

Brett Stubbs

669
1
9
19

score 0 · Answer 3 · edited Aug 10 '21 at 19:53

0

try this :

select unique_id, field01 , field02 , field03 from (table_name) group by unique_id having sum(unique_id)<1

edited Aug 10 '21 at 19:53

Ankit Bajpai

13,128
4
25
40

answered Dec 10 '12 at 09:32

dtnder

381
2
10
25

Removing Duplicate Entries in MySQL Database without a primary Key

3 Answers3