Delete duplicate rows in a table

Question

I have written a query to find the duplicate rows in a table. It outputs duplicate pairs (e.g. row: 1,2 after that you enounter row: 2,1)

SELECT m1.ID, m2.ID
FROM AccessLog_Manual m1
    INNER JOIN AccessLog_Manual m2 ON 
        m1.ACCESS = m2.ACCESS AND 
        m1.EMPLOYEEID = m2.EMPLOYEEID AND
        m1.LOGDATETIME = m2.LOGDATETIME AND
        m1.MORPHOACCESSID = m2.MORPHOACCESSID AND
        m1.ID <> m2.ID
ORDER BY m1.ID

Now I want to use this query in a delete statement, such that removing only one row of pairs. (i.e. for example row 1,2 and row 2,1, only delete 1 and let 2 non-deleted)

What RDBMS are you using? Some allow you to delete with a JOIN, some don't. — Matt Gibson, Nov 08 '15 at 12:57
@jso This question is different. My query outputs duplicate pairs, appear ascending then descending. — , Nov 08 '15 at 13:00
Oh, have a look into this one then (it's not directly for SQLServer but it gets to the point): http://stackoverflow.com/questions/3921501/find-duplicates-or-more-in-mysql-delete-them-except-the-first-one-input?rq=1 — jso, Nov 08 '15 at 13:02

Andomar · Accepted Answer · 2015-11-08T13:13:17.613

0

You could use row_number() to assign numbers to rows you consider duplicate. A number above 1 means the row is duplicate.

For example:

; with  numbered as
        (
        select  row_number() over (
                    partition by ACCESS, EMPOYEEID, LOGDATETIME, MORPHOACCESSID
                    order by ID) as rn
        ,       *
        from    AccessLog_Manual
        )
delete  numbered
where   rn > 1;

See it working at SQL Fiddle.

edited Nov 08 '15 at 13:13

answered Nov 08 '15 at 13:08

Andomar

232,371
49
380
404

How can I delete duplicate rows (as question states)? – Nov 08 '15 at 13:11
No need for the `in`. Can just delete from the CTE directly http://stackoverflow.com/a/3822833/73226 – Martin Smith Nov 08 '15 at 13:12
@MartinSmith: Interesting, updated in answer. @Kilanny: the query in the answer deletes duplicates (using the `delete` statement) – Andomar Nov 08 '15 at 13:14
@Kilanny: The condition on `rn > 1` ensures that the first row survives. The `order by ID` clause means the remaining row will have the lowest ID of the group. – Andomar Nov 08 '15 at 13:15
Why did you abuse your dupe hammer to reopen this obvious duplicate? Your answer is now the same as the one I linked to? – Martin Smith Nov 08 '15 at 13:23

Delete duplicate rows in a table

1 Answers1