0

this is my table test

id   identifier
---  ---------
1      zz
1      zzz
3      d 
5      w
7      v
8      q
9      cc
9      ccc

here I want to remove the duplicate id's and keep the latest id's. the identifier can have duplicate values it dose not matter but the id's should be unique.

I wrote this query to solve this problem but the problem is that it goes into a infinite loop.

please help me with this as I am not able to see the error. Thanks

 delete test
    from test
    inner join(
    select max(id) as lastId, identifier
    from test
    where id in (
              select id 
                from test
               group by id
              having count(*) > 1
       )
    group by id
    )dup on dup.id = test.id
    where test.id<dup.id
Phani Shashank
  • 98
  • 1
  • 10

4 Answers4

0

Look at How to delete duplicate rows with SQL?

And try this one(works for you want to do), i did with the identifier column, but with the date column as shown in the post is better.

DELETE FROM Test WHERE Identifier NOT IN
    (SELECT MAX(Identifier) FROM Test GROUP BY Id);

Now with the DateField:

id   identifier   DateField
---  ---------    ----------
1      zz         2013-02-01
1      zzz        2013-03-02
3      d          2013-03-02
5      w          2013-03-02
7      v          2013-03-02
8      q          2013-03-02
9      cc         2013-01-15
9      ccc        2013-03-02

that is the table, and row (1, zzz) is newer than (1,zz), you can know it by the DateField column, then this query deletes two rows (1, zz) and (9, cc) the oldest for Id's 1 and 9.

DELETE FROM Test WHERE Datefield NOT IN
    (SELECT MAX(Datefield) FROM Test GROUP BY Id);

in SQL Server 2008 R2 i didnt get any error.

Community
  • 1
  • 1
Logar314159
  • 503
  • 4
  • 16
  • Geting an error #1093 - You can't specify target table 'test' for update in FROM clause. looked into the problem it basically says'cannot delete from a table and select from the same table in a subquery' – Phani Shashank Apr 02 '13 at 01:00
  • This will not work if there is fully duplicated rows or duplicated identifier or max(identifier) is not relating to the latest id. – ljh Apr 02 '13 at 01:01
  • Is there no way to work around? my original table contains about 300,000 records. and I need to remove the duplicate id's and keep the latest id's. – Phani Shashank Apr 02 '13 at 01:22
  • does your table have a date column? – Logar314159 Apr 02 '13 at 15:10
  • No it has only 2 coulmns. I solved this issue. and updated the code – Phani Shashank Apr 02 '13 at 18:44
0

If you have an index on test(id, identifier), the following should be pretty efficient:

delete from test
    where test.identifer < (select maxid 
                            from (select max(identifier) as maxid from test t2 where t2.id = t.id
                                 ) a
                           )

The double nested query is a MySQL trick for referencing the update/delete table in the same query.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

CREATE TEMPORARY TABLE tmp_tb_name AS
SELECT id,SUBSTRING_INDEX(GROUP_CONCAT(identifier ORDER BY id DESC),',',1)
FROM tb_name GROUP BY id ORDER BY NULL;

TRUNCATE TABLE tb_name;

INSERT INTO tb_name SELECT tmp_tb_name;

DROP TEMPORARY TABLE IF EXISTS tmp_tb_name;

babaoqi
  • 96
  • 3
0

Update I have found a solution for the issue: This is how I did it to solve the issue This is the solution which worked for me when there are millions on entities in the table. Any other SQL query is creating a lot of processes and burdening the server.

$i=0;
    while($i<10)
    {

        $statement="SELECT * 
                      FROM  test 
                      WHERE  id = :i";
        $query=$db->prepare($statement);
        $query->bindParam(':i',$i,PDO::PARAM_INT);
        $query->execute();
        $results=$query->fetchAll(PDO::FETCH_ASSOC);
        $c=count($results);
        $temp=$results[$c-1];

        $statement="DELETE FROM test WHERE id= :i";
           $query=$db->prepare($statement);
             $query->bindParam(':i',$i,PDO::PARAM_INT);
             $query->execute();

        $statement="Insert into test values(:id,:identifier)";
        $query=$db->prepare($statement);
             $query->bindParam(':id',$temp['id'],PDO::PARAM_INT);
             $query->bindParam(':identifier',$temp['identifier'],PDO::PARAM_STR);
             $query->execute();
             $results=$query->fetchAll(PDO::FETCH_ASSOC);

        $i++;

    }
Phani Shashank
  • 98
  • 1
  • 10