1

I have written a query to find the duplicate rows in a table. It outputs duplicate pairs (e.g. row: 1,2 after that you enounter row: 2,1)

SELECT m1.ID, m2.ID
FROM AccessLog_Manual m1
    INNER JOIN AccessLog_Manual m2 ON 
        m1.ACCESS = m2.ACCESS AND 
        m1.EMPLOYEEID = m2.EMPLOYEEID AND
        m1.LOGDATETIME = m2.LOGDATETIME AND
        m1.MORPHOACCESSID = m2.MORPHOACCESSID AND
        m1.ID <> m2.ID
ORDER BY m1.ID

Now I want to use this query in a delete statement, such that removing only one row of pairs. (i.e. for example row 1,2 and row 2,1, only delete 1 and let 2 non-deleted)

jso
  • 484
  • 5
  • 13
  • What RDBMS are you using? Some allow you to delete with a JOIN, some don't. – Matt Gibson Nov 08 '15 at 12:57
  • @MattGibson Sql Server –  Nov 08 '15 at 12:57
  • @jso This question is different. My query outputs duplicate pairs, appear ascending then descending. –  Nov 08 '15 at 13:00
  • 1
    Oh, have a look into this one then (it's not directly for SQLServer but it gets to the point): http://stackoverflow.com/questions/3921501/find-duplicates-or-more-in-mysql-delete-them-except-the-first-one-input?rq=1 – jso Nov 08 '15 at 13:02

1 Answers1

0

You could use row_number() to assign numbers to rows you consider duplicate. A number above 1 means the row is duplicate.

For example:

; with  numbered as
        (
        select  row_number() over (
                    partition by ACCESS, EMPOYEEID, LOGDATETIME, MORPHOACCESSID
                    order by ID) as rn
        ,       *
        from    AccessLog_Manual
        )
delete  numbered
where   rn > 1;

See it working at SQL Fiddle.

Andomar
  • 232,371
  • 49
  • 380
  • 404
  • How can I delete duplicate rows (as question states)? –  Nov 08 '15 at 13:11
  • No need for the `in`. Can just delete from the CTE directly http://stackoverflow.com/a/3822833/73226 – Martin Smith Nov 08 '15 at 13:12
  • @MartinSmith: Interesting, updated in answer. @Kilanny: the query in the answer deletes duplicates (using the `delete` statement) – Andomar Nov 08 '15 at 13:14
  • @Kilanny: The condition on `rn > 1` ensures that the first row survives. The `order by ID` clause means the remaining row will have the lowest ID of the group. – Andomar Nov 08 '15 at 13:15
  • Why did you abuse your dupe hammer to reopen this obvious duplicate? Your answer is now the same as the one I linked to? – Martin Smith Nov 08 '15 at 13:23