What you want to do is use a projection that numbers each record within a given duplicate set. You can do that with a Windowing Function, like this:
SELECT Name, Owner
,Row_Number() OVER ( PARTITION BY Name, Owner ORDER BY Name, Owner, Birth) AS RowNum
FROM dbo.animals
ORDER BY Name, Owner
This should give you results like this:
Name Owner RowNum
Ecstasy Sacha 1
Ecstasy Sacha 2
Ecstasy Sacha 3
Gremlin Max 1
Gremlin Max 2
Gremlin Max 3
Outch Max 1
Outch Max 2
Outch Max 3
Now you want to convert this to a DELETE
statement that has a WHERE
clause targeting rows with RowNum > 1
. The way to use a windowing function with a DELETE
is to first include the windowing function as part of a common table expression (CTE), like this:
WITH dupes AS
(
SELECT Name, Owner,
Row_Number() OVER ( PARTITION BY Name, Owner ORDER BY Name, Owner, Birth) AS RowNum
FROM dbo.animals
)
DELETE FROM dupes WHERE RowNum > 1;
This will delete later duplicates, but leave row #1 for each group intact. The only trick now is to make sure row #1 is the correct row, since not all of your duplicates have the same values for the Birth
or Death
columns. This is the reason I included the Birth
column in the windowing function, while other answers (so far) have not. You need to decide if you want to keep the oldest animal or the youngest, and optionally change the Birth
order in the OVER
clause to match your needs.