I'm looking for a good way to do the following:
We have a table Accounts
where some records have accidentally been imported twice, so they are duplicates. I have found that I can select all rows which are imported twice with the following query:
select name, vatnumber from Accounts
WHERE IsDeleted='false'
GROUP BY name, vatnumber
HAVING count(*) > 1
Basically: if the name
and the vatnumber
are the same, they are duplicates.
This gives me all the records that have been imported twice. Now I'm looking for a way to get the first Id of every double record so I can do something like:
UPDATE Accounts SET IsDeleted='true'
WHERE Id = (select id ...)
So basically, I'm trying to update the first row of every double record, so there are no more doubles. Can anyone please point me in the right direction? I haven't got the faintest clue on how to start doing this besides doing manual labour and I'm guessing there is a much easier way.
Sample data:
Id VatNumber Name
1 BE10128292 Microsoft
2 BE99292200 Google
3 BE10128292 Microsoft
4 BE99292200 Some other company
5 BE99292200 Google
Desired result:
Id VatNumber Name
1 BE10128292 Microsoft
2 BE99292200 Google
It doesn't matter whether I get the first Microsoft
or the last Microsoft
record really. Ideally, it would be the first though.