0

I am using SQL Server 2005 and would like to delete duplicate records that have the oldest dates. For example, I have a training database that captures an employee number, class name and date. Some folks have taken the class more than once, but we only need to keep the record of the latest class attendance.

1 Answers1

1

I would use ROW_NUMBER in a CTE:

WITH CTE AS
(
    SELECT  ID, Scan, TrainingDate, TrainingType, 
        rn = ROW_NUMBER() Over (Partition By Scan, TrainingType 
                                Order By TrainingDate DESC)
    FROM dbo.SETD_SimpleScan
)
DELETE FROM CTE WHERE rn > 1

Replace DELETE with SELECT * to see what you're going to delete.

Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • I don't understand the answer. My table name is SETD_SimpleScan. The columns are ID (PK), Scan, TrainingDate and TrainingType. How would the query look with this info? – Peggy Hall Aug 15 '14 at 21:19
  • I want to delete the records that contain the earlier TrainingDate for each TrainingType for each Scan. – Peggy Hall Aug 15 '14 at 21:31
  • The query parses but when I go to run it, SQL Server Managment Studio closes. – Peggy Hall Aug 15 '14 at 21:34
  • @PeggyHall: it works without a problem in SQL-Server 2005(i'm also using it). But the editor in SSMS crashes if you use it there. At least i have this problem, i've even [asked a question](http://stackoverflow.com/questions/5091742/ssms-crashes-when-opening-design-query-editor) here some time ago. Just design and run it in the normal query window. – Tim Schmelter Aug 15 '14 at 21:35
  • The query ran and showed 674 row affected, but there are still duplicte trainings for some scans. – Peggy Hall Aug 15 '14 at 21:45
  • 1
    Disregard my last post. All duplicates are now gone! That was so cool!!! Not sure why it worked but very thankful that it did! – Peggy Hall Aug 15 '14 at 21:47