So we had a duplicate SQL scripts running on our server and didn't realize it till just recently. Essentially I have many rows where there are 2 entries with the same column x (crn).
The initially got entered with the same column y (status) as well. Our application has users update the column y (status). However now we have 2 rows one with a status of 'S' and one with a status of something other than 'S'. My goal:
DELETE everything from the table WHERE there is a duplicate CRN and the STATUS is S. I don't want to delete rows unless there is a duplicate, but if there is, I only want to delete the row with a status of 'S'. Also, I'd rather not delete both records if both have a status of S, but if I do, that isn't such a big deal because I will get the courses again in the next download.
I have started making a select statement to query the rows I want, but don't know how to do the ONLY SELECT IF DUPLICATE EXISTS part. I feel like I need to UNION or LEFT JOIN or something to only get records if a duplicate exists.
SELECT * FROM
cas_usuECourses
WHERE
crn IN (SELECT crn FROM cas_usuECourses GROUP BY crn having count(1) > 1)
AND status = 'S'
AND termCode = 201320
EDIT: Is there a way to say... the above, but if both dups have 'S' only delete one of them?
EDIT: I "think" this looks good to me. Any thoughts?
SELECT id FROM (
SELECT id, Row_Number() Over (Partition By crn ORDER BY id DESC) as ranking
FROM cas_usuECourses
WHERE status = 'S'
AND termCode = 201320
) as ranking
WHERE ranking = 1
I think this will give me all the ids where the status is 'S', and if there are two with 'S' this will give me the one that was created second. I found out that every entry in our termCode has duplicates, so... don't need to worry about checking for the duplicates.