We're going through a bit of a clean-up exercise and I need to remove duplicate data that has accidentally been added to our database table. The ID is obviously different, but other fields are the same.
I can use the following query to select the duplicate data sets:
SELECT user_id, start_datetime, count(id) AS dup_count
FROM our_table
WHERE status = 1
GROUP BY user_id, start_datetime
HAVING count(id) > 1;
What I need to do is create a query that would take each of the duplicate IDs APART FROM THE FIRST and use that to update the status to 0.
I'm not sure I can do this is one query, but I think the steps are as follows:
- Run a query similar to the one above
- Extract all the IDs for the duplicate sets
- Ignore the first in the list as we don't want to alter the correctly added first record
- Run the update on the remaining set of IDs
Am I out of luck here - or is it possible to do?
Many thanks!