Suppose I have a dictionary of duplicate image IDs:
dict_duplicates = {0: [6], 1: [3], 2: [7], 3: [1], 4: [5], 5: [4], 6: [0], 7: [2]}
Here image 0 has a list of duplicates that includes image 6 and, conversely, image 6 has a list of duplicates that includes image 0.
I also have a table that shows each image ID and the date it was created.
How can I create a list of unique images by earliest creation date?
To clarify, this is what I was doing:
dups = set()
for key, value in ordered_dict_duplicates.items():
    if key not in dups:
        dups = dups.union(value)
Output:
{6: [0], 3: [1], 7: [2], 1: [3], 5: [4], 4: [5], 0: [6], 2: [7]}
6
{0}
3
{0, 1}
7
{0, 1, 2}
1
5
{0, 1, 2, 4}
4
0
2
- Image 6 is not in the master set of duplicates, add image 0 to the set. {0}
- Image 3 is not in the master set of duplicates, add image 1 to the set. {0, 1}
- Image 7 is not in the master set of duplicates, add image 2 to the set. {0, 1, 2}
This is where it "breaks".
- Image 1 has already been added to the master set of duplicates, skips image 3.
- Image 5 is not in the master set of duplicates, add image 4 to the set. {0, 1, 2, 4}
- Image 4 has already been added, skip.
- Image 0 has already been added, skip.
- Image 2 has already been added, skip.
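The walkthrough above can be reproduced with a small self-contained sketch. The iteration order is copied from the dictionary printed in the output; the `kept` list and the `trace_duplicates` name are not in my original code and are added here only to make visible which keys pass the `if` check.

```python
# Iteration order copied from the dictionary printed in the output.
ordered_dict_duplicates = {6: [0], 3: [1], 7: [2], 1: [3],
                           5: [4], 4: [5], 0: [6], 2: [7]}


def trace_duplicates(ordered):
    """Replay the loop and record which keys were not yet marked as duplicates."""
    dups = set()   # master set of IDs treated as duplicates
    kept = []      # hypothetical helper list: keys that passed the check
    for key, value in ordered.items():
        if key not in dups:
            kept.append(key)
            dups = dups.union(value)
    return kept, dups


kept, dups = trace_duplicates(ordered_dict_duplicates)
print(kept)
print(sorted(dups))
```

The result depends entirely on the order in which the keys are visited, since a key is only skipped if one of its duplicates happened to come earlier.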
The problem is that image 3 is the earliest version of the image (9/18), while image 4 is dated 9/22.
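One possible direction, as a rough sketch rather than a confirmed solution: group every image with its duplicates first, then pick the earliest-dated image from each group, so the result no longer depends on iteration order. The `created` mapping and the `earliest_per_group` name are hypothetical stand-ins for the date table. Only the dates for image 3 (9/18) and image 4 (9/22) come from this post; the others are made-up placeholders, stored as (month, day) tuples because no year is given.

```python
dict_duplicates = {0: [6], 1: [3], 2: [7], 3: [1], 4: [5], 5: [4], 6: [0], 7: [2]}

# Hypothetical (month, day) creation dates standing in for the date table.
# Only 3 -> 9/18 and 4 -> 9/22 are from the post; the rest are placeholders.
created = {6: (9, 17), 3: (9, 18), 7: (9, 19), 1: (9, 20),
           5: (9, 21), 4: (9, 22), 0: (9, 23), 2: (9, 24)}


def earliest_per_group(duplicates, created):
    """Return one image ID per duplicate group: the one with the earliest date."""
    seen = set()
    result = []
    for start in duplicates:
        if start in seen:
            continue
        # Collect every image reachable through duplicate links.
        group = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(duplicates.get(node, []))
        seen |= group
        # Earliest date wins; ties are broken by the smaller ID.
        result.append(min(group, key=lambda i: (created[i], i)))
    # Order the final list by creation date.
    return sorted(result, key=lambda i: created[i])


unique_images = earliest_per_group(dict_duplicates, created)
print(unique_images)
```

With the placeholder dates above this keeps one image from each of the four pairs. Walking the links with a stack also covers the case where a duplicate list holds more than one ID.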