1

I have a table like this:

table

id
=====
1
1
2
2
6
6
7
5
5
9

I need to remove duplicates other than creating a new table. One way of doing it is:

create table_new (select distinct id from table )

Is there any delete command which removes the duplicates?

Mike Mackintosh
  • 13,917
  • 6
  • 60
  • 87
syncdm2012
  • 445
  • 2
  • 10
  • 22
  • 2
    possible duplicate of [How to remove duplicate entries from a mysql db?](http://stackoverflow.com/questions/2867530/how-to-remove-duplicate-entries-from-a-mysql-db) – rob mayoff Aug 08 '12 at 07:13
  • See here: http://stackoverflow.com/questions/11557757/cleaning-up-db-of-redundant-data/11557812#11557812 – Omesh Aug 08 '12 at 11:11

7 Answers7

1

IMHO in common case (Ansi SQL) you can't do it because without unique key field you can't distinguish one row from another so you can't delete it.

But in MySQL you can run this UGLY query. It uses @S sting to accumulate ID's so DON'T use it on a big table:

set @S:=',';
delete from t 
where if(LOCATE(CONCAT(',',ID,','),@S)>0,'del',@S:=CONCAT(@S,ID,','))='del'
valex
  • 23,966
  • 7
  • 43
  • 60
0

First hit on google:

http://support.microsoft.com/kb/139444/en-us?fr=1

The article describes the use of a temporary table to hold the unique records, clearing the original table and adding the unique records back to it, you can probably do it by writing some stored procedure but i suppose you only have to do this once? then i would go with the method as described. Also to prevent future problem use the unique or primary key constraint on that field so that it is not possible to have 2 records with the same value.

Abdul Rahman
  • 2,097
  • 4
  • 28
  • 36
wterbeek
  • 451
  • 9
  • 26
0

You can use cte:

;WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY ID 
                                       ORDER BY ( SELECT 0)) RN
         FROM   #MyTable)
DELETE FROM cte
WHERE  RN > 1

This will avoid the need to create a temporary or holding table for your distinct records. If you have additional columns that may contain a duplicate value, then you can add them after ID in the PARTITION BY expression.

Although I would be concerned as to why you have duplicate ID values.

  • If this is a row ID, then I would suggest using IDENTITY and auto-increment the IDs.
  • If it is an item or reference ID, then I would try to avoid adding duplicates (this could be completely wrong depending on your structure)
Robert
  • 8,717
  • 2
  • 27
  • 34
0

A single query to delete all duplicates (tested in Oracle)

delete from my_table where rowid not in (select min(rowid) from temp_table group by id);

Explanation:

Oracle gives every row a rowid by default (I am sure other databases have same or similar thing).

Let's look at the inner query first. Group by clause groups the rows based on some criteria, in our case Id. We can group based on multiple columns as well (group by col1, col2). When we say min(rowid), we are just making sure that it return one rowid in case there are multiple rows with duplicate columns.

The outer delete query is simple, it just deletes everything else (duplicates) from table which is not returned by our inner query

Kamal
  • 5,462
  • 8
  • 45
  • 58
  • Is there any standard delete query which works in all databases – syncdm2012 Aug 08 '12 at 07:06
  • @Mdharmendra Haven't worked much on mysql, but it seems this is what you are looking for http://stackoverflow.com/questions/2728413/equivalent-of-oracles-rowid-in-mysql – Kamal Aug 08 '12 at 07:10
0

Sample code

http://sqlfiddle.com/#!3/56182/2

Satinder singh
  • 10,100
  • 16
  • 60
  • 102
0

I'm not sure if there is any direct implementation, but here is a function implementation which you might be interested at.

function RemoveDuplicates($TableName, $UniqueFieldName, $IDFieldName, $FirstFoundIDValue)
{
    $Query =    "DELETE  FROM ".$TableName." ".
                "WHERE    ".$IDFieldName." IN ".
                "( SELECT a.".$IDFieldName." ".
                    "FROM ".$TableName." a, ".$TableName." b ".
                    "WHERE (a.".$UniqueFieldName." = b.".$UniqueFieldName.") ".
                    "AND (a.".$IDFieldName." > ".$FirstFoundIDValue.") ".
                    "GROUP BY a.".$IDFieldName." ".
                    "HAVING COUNT(a.".$UniqueFieldName.") > 1 ".
                ")";
    db_query($Query);

// Example Use:            
// delete  from tblps_CommEquip 
// where PSID in    
// ( select a.PSID from tblps_CommEquip a, tblps_CommEquip b     
//      where    (a.CommEquipType = b.CommEquipType) 
//      AND (a.PSID > 80)    
//      group by a.PSID 
//      having count(a.CommEquipType) > 1
// )
} 
Mike Mackintosh
  • 13,917
  • 6
  • 60
  • 87
GeeksCook
  • 81
  • 1
  • 1
0

The best way for deleting duplicate rows is to provide a primary key column that uniquely identifies each record in the table.

Peter DeWeese
  • 18,141
  • 8
  • 79
  • 101
lojuego
  • 11