Extract list of duplicate values and locations from array

Question

Given an array a of length N, which is a list of integers, I want to extract the duplicate values, where I have a seperate list for each value containing the location of the duplicates. In pseudo-math:

If |M| > 1:
  val -> M = { i | a[i] == val }

Example (N=11):

a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]

should give the following lists:

3  -> [1, 6, 7]
1  -> [2, 5]
10 -> [9, 10]

I added the python tag since I'm currently programming in that language (numpy and scipy are available), but I'm more interestead in a general algorithm of how to do it. Code examples are fine, though.

One idea, which I did not yet flesh out: Construct a list of tuples, pairing each entry of a with its index: (i, a[i]). Sort the list with the second entry as key, then check consecutive entries for which the second entry is the same.

Possible duplicate: http://stackoverflow.com/questions/9835762/find-and-list-duplicates-in-python-list — Slater Victoroff, Aug 20 '13 at 15:38
@SlaterTyranus: Nope. Read the question again, I need the list of duplicates _and their locations_. — Markus, Aug 20 '13 at 15:43
It's like a 12 character change from that question though. Close enough that you should be able to figure it out from there. — Slater Victoroff, Aug 20 '13 at 15:44

Brionius · Answer 1 · 2013-08-20T15:53:07.100

4

Here's an implementation using a python dictionary (actually a defaultdict, for convenience)

a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
from collections import defaultdict
d = defaultdict(list)

for k, item in enumerate(a):
    d[item].append(k)
finalD = {key : value for key, value in d.items() if len(value) > 1}  # Filter dict for items that only occurred once.

print(finalD)    
# {1: [2, 5], 10: [9, 10], 3: [1, 6, 7]}

edited Aug 20 '13 at 15:53

answered Aug 20 '13 at 15:45

Brionius

13,858
3
38
49

That final line should really be a dict comprehension – Slater Victoroff Aug 20 '13 at 15:50
@SlaterTyranus Good idea. For some reason I never remember that dictionary comprehensions exist. – Brionius Aug 20 '13 at 15:53

score 3 · Accepted Answer · answered Aug 20 '13 at 15:50

The idea is to create a dictionary mapping the values to the list of the position where it appears.

This can be done in a simple way with setdefault. This can also be done using defaultdict.

>>> a = [0, 3, 1, 6, 8, 1, 3, 3, 2, 10, 10]
>>> dup={}
>>> for i,x in enumerate(a):
...     dup.setdefault(x,[]).append(i)
...
>>> dup
{0: [0], 1: [2, 5], 2: [8], 3: [1, 6, 7], 6: [3], 8: [4], 10: [9, 10]}

Then, actual duplicates can be extracted using set comprehension to filter out elements appearing only once.

>>> {i:x for i,x in dup.iteritems() if len(x)>1}
{1: [2, 5], 10: [9, 10], 3: [1, 6, 7]}

I'm accepting your answer since the set comprehension seems to be the cleaner looking solution. — Markus, Aug 21 '13 at 09:11

score 1 · Answer 3 · answered Aug 20 '13 at 15:45

Populate a dictionary whose keys are the values of the integers, and whose values are the lists of positions of those keys. Then go through that dictionary and remove all key/value pairs with only one position. You will be left with the ones that are duplicated.

Extract list of duplicate values and locations from array

3 Answers3