0

i'm trying to remove duplicates from a table with no primary key. the structure of the table look like this:

|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00  |a       |
|001|01.01.2005|12:00:00  |null    |
|002|01.01.2005|12:00:00  |a       |
|002|01.02.2005|12:00:00  |a       |
|002|01.02.2005|12:45:00  |a       |
|003|01.01.2005|12:00:00  |a       |
|003|01.01.2005|12:00:00  |a       |
|003|01.02.2005|12:00:00  |a       |
|003|01.03.2005|12:00:00  |a       |
|003|01.03.2005|12:00:00  |null    |

what i'm trying to achieve is: check for duplicates 'cID', keep the one which has the latest date in 'changeDate'. if two records have the same 'changeDate' keep the one with the latest 'changeTime'. And of this result if there are still duplicates keep the one that has not null in 'operator'.

the above table should look like this:

|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00  |a       |
|002|01.02.2005|12:45:00  |a       |
|003|01.03.2005|12:00:00  |a       |

DB is mysql, engine is innodb. i would like to achieve this without creating a new table.

my sql skills are limited, in fact almost nonexistant. i've been reading and searching for a while, and i'm not getting there...

i've tried different approaches (temp tables, select with inner join)

any help would be much apreciated.

Barmar
  • 741,623
  • 53
  • 500
  • 612
  • possible duplicate of [How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/how-can-i-remove-duplicate-rows) – ModulusJoe Nov 07 '13 at 15:39

1 Answers1

1

see SQLfiddle demo

Preserve all rows you need in a temporary table:

SELECT t.* 
FROM (
SELECT ( 
    CASE cid 
    WHEN @curCId 
    THEN @curRow := @curRow + 1 
    ELSE @curRow := 1 AND @curCId := cid END
  ) AS rank,
  p.*
FROM      mytable p,(SELECT @curRow := 0, @curCId := '') r
ORDER BY cid,changedate desc,changetime desc,operator desc
) t
where rank =1

Then delete rows from yourtable

To finish, insert rows from temp table to yourtable

Community
  • 1
  • 1
Kobi
  • 2,494
  • 15
  • 30
  • thanks so much Kobi for your SQL and for the fiddle demo. I have tried this and it seems to return the approx amount of rows i was expecting. now i'm trying to instead of select to delete these rows. i've "fiddled" around but don't seem to get to make it work. i'm not quite sure where to place the delete statement... – user2965311 Nov 07 '13 at 16:41
  • in the demo, A) I insert the data you need in the temp table B) delete all rows from table with dupe C) insert all rows from temp to the other table. D) select data from the table ( there are only the rows you needed) – Kobi Nov 07 '13 at 16:48