The MRG table contains 1B+ rows, spread over several smaller tables. Additional observations are added to the MRG table, which adds them to the last underlying table.
For some reason, there appears to be a couple of duplicate rows sharing the same primary key. I know that this is a limitation of the MRG tables and it is not really problematic in our case (it simply means useless redundancy). We believe that the duplicates were inserted following a system crash, such that the reprocessing wasn't properly handled by the application's code that deletes duplicates from a temporary table prior to insertion in the archive.
This being said, is there a set of best practices to maintain uniqueness that will not require too much processing time?