4

I would like to remove all but one row but in conjunction with PHP.

I think it will be a simpler solution for my problem, which is to remove duplicate information.

Ok... I just realised what i've already done is crap.

Basically i have a table (dev_discs) and i want to search for duplicates in the column 'discReference' and then delete all but one, thus leaving the one remaining row.

Is there an easy query to do this? I came across similar situations on the net and they're all complicated... I don't think its a very complicated scenario.

Harvey
  • 57
  • 1
  • 1
  • 3
  • What RDBMS is this for? If anything but `MySQL` [one of the answers here should work](http://stackoverflow.com/q/18932/73226) but MySQL is funny about referencing the table you are deleting from in a sub query. – Martin Smith Sep 10 '11 at 12:58
  • What are you trying to do? Do you want to delete all the duplicates of one row? Do you want to delete all the duplicates of a group of the rows? Do you want to delete all the duplicates of all the rows of a table? Do you want to delete all the rows of a table save 1? – xanatos Sep 10 '11 at 12:58
  • It would seem simplest to just a select query to create the delete queries, then just pick the one you want to keep. – James Black Sep 10 '11 at 13:00
  • Thank you everyone for your replies. My answer should now explain what i wanted to do. – Harvey Sep 10 '11 at 13:16
  • 1
    No, you haven't told us what RDBMS you are using, but I suspect it's MySQL. – xanatos Sep 10 '11 at 13:59

2 Answers2

0

No easy way, AFAIK. With PHP it shouldn't be too difficult though (I can think of no way you can do this easily just using SQL).

  1. Create a new table with the same structure as the original one:

    CREATE TABLE dev_discs_noduplicates ( ... );

  2. Fetch all non-duplicated data from the original table with a GROUP clause to eliminate duplicates:

    SELECT * FROM dev_discs WHERE 1 GROUP BY dupe_col1, dupe_col2;

  3. Iterate over the results and insert them to the new table:

    INSERT INTO dev_discs_noduplicates ...;

  4. Delete old table

    DROP TABLE dev_discs;

  5. Rename new table:

    RENAME TABLE dev_discs_nodulicates TO dev_discs;

[EDIT]

Since @xanatos rightfully remarked a possible relations loss in the database, here's an alternative solution involving SQL and PHP.

First, fetch unique rows:

SELECT id FROM dev_discs GROUP BY col1, col2;

Having these in a PHP array, implode it and use in the delete query:

DELETE FROM dev_discs WHERE (id) NOT IN ( @arr );

This should take care of all the possible issues.

mingos
  • 23,778
  • 12
  • 70
  • 107
  • In the mean time you have lost all the relations you had on your table, the ACL you had on your table... – xanatos Sep 10 '11 at 13:59
  • @xanatos: yes, if they are foreign keys and/or triggers, they are lost as well. I assume the OP can recreate these. Plus, he does not mention these, so it's logical that my answer does not include them either. I'm curious to know your solution, as I suspect it's bullet-proof ;). – mingos Sep 11 '11 at 13:16
  • It's quite simple :-) http://www.kodyaz.com/articles/delete-duplicate-rows-using-row-number-partition-by-order-by.aspx You use ROW_NUMBER() and CTE to delete the duplicates. I don't write it in a response because the OP didn't specify what SQL he is using, and I DO think persons that don't follow their questions don't deserve good answers. If you want, you are free to steal it from me and from the article :-) – xanatos Sep 11 '11 at 13:30
  • Interesting. Since the OP uses PHP, I assume it's most probable he's working wuth MySQL, not MSSQL. Does your solution work in MySQL, or is it MSSQL-specific? – mingos Sep 11 '11 at 13:43
  • I don't know, and I think too he is using MySQL. I don't think it works on MySQL. But if there is a primary key on his table then you should use it to delete duplicate rows. – xanatos Sep 11 '11 at 13:45
0
DELETE FROM discs dd WHERE EXISTS
    ( SELECT * FROM discs d2
     WHERE d2.discReference = dd.discReference
     AND d2.rownumber < dd.rownumber
    );

"rownumber" is an internal id, which is supplied by your DBMS. Could be called tid, oid.

wildplasser
  • 43,142
  • 8
  • 66
  • 109