0

Possible Duplicate:
How to delete Duplicates in MySQL table

I imported data that I had in plain text files into a mysql database. Now I found I had duplicated entries, which I want to delete. The duplicates are identified by a key which is not the primary key. Please remind that I have to keep one of the duplicated items.

the table, T1, with three duplicates, for example:

ID(unique, primary key) REAL_ID(char(11))

1 '01234567890'
2 '01234567891'
3 '01234567891'
4 '01234567891'
...

Now, I use

 SELECT ID AS x, COUNT(*) AS y FROM T1 GROUP BY x HAVING y>1;

to identify the duplicates. The result i

+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    1 | 01234567891 |
|    2 | 01234567891 |
|    3 | 01234567891 |
+------+-------------+

I can even construct the list of the Ids I have to delete:

SELECT ID 
FROM T1 
  RIGHT JOIN ( (SELECT ID AS x, COUNT(*) AS y 
                FROM T1 
                GROUP BY x 
                HAVING y>1) AS T2 ) ON (T2.x=T1.REAL_ID) LIMIT 1,100;

The result is

+------+-------------+
| ID   | REAL_ID     |
+------+-------------+
|    2 | 01234567890 |
|    3 | 01234567890 |
+------+-------------+

Now, I need help on how to delete these entries.

Since is not possible to use DELETE in combination with subqueries, I was trying to flag all the duplicated entries in the column REAL_ID, and then use

DELETE FROM T1 WHERE REAL_ID='flag';

but I can't figure out how to flag these entries.

Community
  • 1
  • 1
chuse
  • 373
  • 3
  • 18
  • Duplicate of [How to delete duplicates in MySQL table](http://stackoverflow.com/questions/672702/), [How to delete duplicates in MySQL using CASE](http://stackoverflow.com/questions/2469006/), [How to delete duplicates on MySQL table](http://stackoverflow.com/questions/2630440/). – Jonathan Leffler Dec 12 '12 at 13:45

1 Answers1

0

You can do this:

DELETE t
FROM T1 t 
RIGHT JOIN 
( 
   SELECT ID, COUNT(*) AS y 
   FROM T1 
   GROUP BY ID
   HAVING y > 1 
) AS T2 ON T2.ID = t.REAL_ID;

Update: Note that you can't use LIMI or ORDER BY when deleting from multiple table, as quoted from DELETE:

For the multiple-table syntax, DELETE deletes from each tbl_name the rows that satisfy the conditions. In this case, ORDER BY and LIMIT cannot be used

Try this instead:

DELETE t
FROM T1 t 
WHERE REAL_ID IN
( 
  SELECT t1.REAL_ID
  FROM T1 t1
  RIGHT JOIN 
  ( 
     SELECT ID, COUNT(*) AS y 
     FROM T1 
     GROUP BY ID
     HAVING y > 1 
  ) AS T2 ON T2.ID = t.REAL_ID
  WHERE t1.REAL_ID IS NOT NULL
  ORDER BY t2.y DESC
  LIMIT 1, 1000
)
Mahmoud Gamal
  • 78,257
  • 17
  • 139
  • 164