3

EDIT: I think I now have the solution but need to do some more sense checking...

DELETE TBLFIRE_TEMP3 FROM TBLFIRE_TEMP3
LEFT OUTER JOIN (
   SELECT MIN(FireNo) as FireNo, ActionRef, FRADate, FIREUPRN
   FROM TBLFIRE_TEMP3 
   GROUP BY ActionRef, FRADate, FIREUPRN
) as KeepRows ON
   TBLFIRE_TEMP3.FireNo = KeepRows.FireNo
WHERE
   KeepRows.FireNo IS NULL

-############### Previous Comments ###############

I have a table which has duplicates in (based on three columns). I can find them and see them by doing the following and would then simply want to delete the duplicates (i.e. so all count(*) results are '1')

SELECT COUNT(*),ActionRef, FRADate, FIREUPRN
FROM TBLTempTable
GROUP BY ActionRef, FRADate, FIREUPRN

So I can see the count of how many times these groups occur. What I want to do is Delete the duplicates. I've tried the below but it deletes every row, even singular:

DELETE a FROM TblTempTable a JOIN
(
  SELECT ActionRef, FRADate, FIREUPRN
    FROM TblTempTable 
   GROUP BY ActionRef, FRADate, FIREUPRN
) d 
   ON (a.ActionRef = b.ActionRef
  AND a.FRADate = b.FRADate
AND a.FIREUPRN = b.FIREUPRN)

Based on the codes I've looked at the guide me I believe I am close but currently it deletes everything.

References: SQL- How can I remove duplicate rows? GROUP BY does not remove duplicates

-These are MySQL so not to relevant in the end:

select and delete rows within groups using mysql Find duplicate records in MySQL

Community
  • 1
  • 1
indofraiser
  • 1,014
  • 3
  • 18
  • 50

2 Answers2

6

A simple solution is to use a CTE with ROW_NUMBER:

WITH Data AS
(
    SELECT RN  = ROW_NUMBER() OVER (PARTITION BY ActionRef, FRADate, FIREUPRN
                                    ORDER BY FRADate ASC),
           Cnt = COUNT(*) OVER (PARTITION BY ActionRef, FRADate, FIREUPRN),
           ActionRef, FRADate, FIREUPRN
    FROM TBLTempTable
)
DELETE FROM Data
WHERE RN > 1

This deletes all but one, it keeps the oldest FRADate. You need to change the ORDER BY in ROW_NUMBER to change this logic.

One advantage of a CTE is that you can change it easily to see what you're going to delete (or update). Therefore you just have to replace DELETE FROM Data with SELECT * FROM Data.

Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • I could take out the FRADate order as well as it's part of the Group By so it not relavant. I use the ID as something might have changed elsewhere. – indofraiser Apr 28 '15 at 09:29
  • 1
    @indofraiser: i haven't even recognized that it's part of the group-by. However, i didn't know how you want to sort so i've taken the first date column that i've seen. Replace it with something more meaningful, you can also use multiple columns. The same rules apply as to a regular `ORDER BY`. – Tim Schmelter Apr 28 '15 at 09:31
  • Great, just wanted to clarify :-) – indofraiser Apr 28 '15 at 09:32
1

There's a simpler method for readability too:

;WITH DEDUPE AS (
SELECT ROW_NUMBER() OVER(
    PARTITION BY ActionRef, FRADate, FIREUPRN
        ORDER BY (SELECT 1)) AS RN
FROM TBLTempTable)
DELETE FROM DEDUPE
WHERE RN != 1

We use this exact script at work on a daily basis. You can change the ORDER BY clause to any column, if you want to keep newer rows based on a date column etc.

John Bell
  • 2,350
  • 1
  • 14
  • 23