1

I have a table in SQL Server similar to this:

Emp#        CourseID        DateComplete        Status
1           Course1         21/05/2012          Failed
1           Course1         22/05/2012          Passed
2           Course2         22/05/2012          Passed
3           Course3         22/05/2012          Passed
4           Course1         31/01/2012          Failed
4           Course1         28/02/2012          Passed
4           Course2         28/02/2012          Passed

Trying to capture the newest record for each course for each emp#. And if the same course has been attempted on the same day capture the 'passed' course record.

Thinking something along these lines:

SELECT DISTINCT .....
        INTO Dup_Table
        FROM MainTable
GROUP BY ........
HAVING COUNT(*) > 1

DELETE MainTable
        WHERE Emp# IN (SELECT Emp# FROM Dup_Table)

INSERT MainTable SELECT * FROM Dup_Table

Drop Table Dup_Table
GO

But not sure if this is the

  1. the best approach and
  2. how to bring the Emp#/courseID/DateComplete/Status all together.
bitoshi.n
  • 2,278
  • 1
  • 16
  • 16
user1407577
  • 83
  • 1
  • 4
  • 12
  • 2
    What version of SQL Server are you using? – Lukas Eder May 21 '12 at 10:28
  • 2
    This question, with variations, has been asked multiple times for SQL Server alone. You could start with this question: [SQL - How can I remove duplicate rows?](http://stackoverflow.com/questions/18932/sql-how-can-i-remove-duplicate-rows). There many others in its *Linked* section. – Andriy M May 21 '12 at 10:34

2 Answers2

8
;WITH cte 
     AS (SELECT Row_number() OVER (partition BY EMPID, courseid ORDER BY 
                DateComplete 
                DESC, 
                status DESC) RN 
         FROM   MainTable) 
DELETE FROM cte 
WHERE  RN > 1 
Romil Kumar Jain
  • 20,239
  • 9
  • 63
  • 92
  • 1
    You don't need the sub query and `#temp` table. You can use the approach from my [answer here](http://stackoverflow.com/a/3822833/73226) – Martin Smith May 21 '12 at 12:08
  • 2
    Yes it will. There is only one table involved in the question `Maintable`. Deleting from the CTE defined on MainTable deletes rows from `Maintable` – Martin Smith May 21 '12 at 12:12
  • 3
    Thanks Martin, I was not aware that deleting records from CTE will reflect to Main table. +1 for that in your answer. – Romil Kumar Jain May 21 '12 at 12:19
2

You can use row_number() by partition and order by scope to get last record

Select *
From  (
    Select *,
           Row_Number() Over (Partition By Emp#, CourseID Order By DateComplete DESC, Case When Status = 'Passed' Then 1 Else 2 End  ) AS RecordNumber
    From #Emp)Z
Where Z.RecordNumber = 1
dequis
  • 2,100
  • 19
  • 25
mehdi lotfi
  • 11,194
  • 18
  • 82
  • 128