Common elements in two lists preserving duplicates

Question

The goal is to find common elements in two lists while preserving duplicates.

For example,

Input:

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

Expected output:

[3,5,5]

I tried set.intersection, but set operations eliminate duplicates.

Does the order of the output elements matter? Would `[5,3,5]` be okay? — John Kugelman, Jul 21 '21 at 11:33
5 repeated twice in output because 5 repeated twice in both lists a and b. — Leo, Jul 21 '21 at 11:34
The order of the output elements doesn't matter. [5,3,5] is okay. — Leo, Jul 21 '21 at 11:34

score 3 · Accepted Answer · answered Jul 21 '21 at 11:36

Here is my suggestion:

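from collections import Counter

# Lists from the question
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

ac = Counter(a)
bc = Counter(b)

res = []

# For each value present in both lists, keep it as many times as the smaller count
for i in set(a).intersection(b):
    res.extend([i] * min(ac[i], bc[i]))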

>>> print(res)
[3, 5, 5]

IoaTzimas

By my count this performs four passes over the input: one for the two `Counter`s, one for the two `set`s, one to build the `intersection()`, and one for the loop over the intersection. Can some of them be combined, you think? – John Kugelman Jul 21 '21 at 11:44
@JohnKugelman Sure they can, but sometimes I prefer clarity and simplicity, especially if no efficiency concerns are mentioned. It can even be done in a one-liner, but it will be very ugly and unreadable. – IoaTzimas Jul 21 '21 at 12:06
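
For reference, one possible compact form (a sketch, not necessarily what the answerer had in mind): `Counter` supports the `&` operator, which keeps each key with the minimum of its two counts, and `elements()` expands those counts back into individual items. The function name `common_multiset` is made up for this illustration.

```python
from collections import Counter

def common_multiset(a, b):
    # Counter & Counter keeps each key with min(count in a, count in b);
    # elements() repeats each key as many times as its remaining count.
    return list((Counter(a) & Counter(b)).elements())

a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]
print(sorted(common_multiset(a, b)))  # [3, 5, 5]
```

The result is sorted here only to make the printed output independent of ordering, which the question says does not matter.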

Darcy · Answer 2 · 2021-07-21T11:41:28.750

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]


def findout(a, b):
    a = a.copy()
    output = []
    for i in b:
        if i in a:
            a.remove(i)
            output.append(i)
    return output


result = findout(a, b)
print(result) # [3, 5, 5]

This may work.

edited Jul 21 '21 at 11:41

answered Jul 21 '21 at 11:36

Note that this doesn't scale well when the lists are large. It's O(len(a)*len(b)) while the optimal solution is O(len(a)+len(b)). – John Kugelman Jul 21 '21 at 11:38
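
A minimal sketch of how the same "consume a match from `a`" idea could run in O(len(a)+len(b)), assuming the elements are hashable: replace the list copy with a `Counter` of remaining occurrences, so membership tests and removals take constant time. The name `findout_linear` is hypothetical and not part of the answer above.

```python
from collections import Counter

def findout_linear(a, b):
    # How many copies of each value from a are still unmatched
    remaining = Counter(a)
    output = []
    for x in b:
        # Counter returns 0 for missing keys, so no KeyError here
        if remaining[x] > 0:
            remaining[x] -= 1
            output.append(x)
    return output

a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]
print(findout_linear(a, b))  # [3, 5, 5]
```

Like the original, this keeps the elements in the order they appear in `b` and leaves both input lists unchanged.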

score 2 · Answer 3 · answered Jul 21 '21 at 11:36

You can build a Counter for each list, then take the keys that occur in both, each repeated the minimum of its two counts:

from collections import Counter

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

ca = Counter(a)
cb = Counter(b)


# Flatten the per-key runs; loop names chosen so they do not shadow a and b
result = [item
          for run in ([key] * min(ca[key], cb[key])
                      for key in ca
                      if key in cb)
          for item in run]
print(result)

Output:

[3, 5, 5]

score 2 · Answer 4 · edited Jul 21 '21 at 11:50

Using Counter from the collections module:

from collections import Counter

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

ans = []
a_count = Counter(a)
b_count = Counter(b)


for i in a_count:
    if i in b_count:
        ans.extend([i]*min(a_count[i], b_count[i]))

print(ans)

Output

[3, 5, 5]

edited Jul 21 '21 at 11:50

John Kugelman

answered Jul 21 '21 at 11:46

Ram

score 1 · Answer 5 · answered Jul 21 '21 at 11:37

The answer depends on whether the lists are always sorted, as in your example. If so, you can use a two-cursor approach that walks both lists in step:

index_a = 0
index_b = 0
common_elements = []

while index_a < len(a) and index_b < len(b):
    if a[index_a] < b[index_b]:
        # then a should check the next number, b should stay
        index_a += 1
    elif a[index_a] > b[index_b]:
        # then the reverse
        index_b += 1
    else:
        # they are equal
        common_elements.append(a[index_a])
        index_a += 1
        index_b += 1

However, if they are not sorted like that, you're probably better off taking the set intersection and then, for each element el in it, adding min(a.count(el), b.count(el)) copies of el to the result.
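
A rough sketch of that unsorted fallback, assuming hashable elements. The function name `common_by_count` is hypothetical. Note that `list.count` rescans the whole list on every call, so this is slower than a `Counter`-based version when there are many distinct values.

```python
def common_by_count(a, b):
    common = []
    # Each value that appears in both lists, visited once
    for el in set(a) & set(b):
        # Keep it as many times as it appears in the list with fewer copies
        common.extend([el] * min(a.count(el), b.count(el)))
    return common

a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]
print(sorted(common_by_count(a, b)))  # [3, 5, 5]
```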

score 1 · Answer 6 · answered Jul 21 '21 at 14:55

The "preserving duplicates" part had me puzzled, but I finally got a solution:

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

def duplicate_finder(a, b):
    # Walk the shorter list and consume matches from a copy of the longer one,
    # so neither input list is modified and no global state is needed
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    remaining = list(longer)
    c = []
    for x in shorter:
        if x in remaining:
            c.append(x)
            # Remove one matched copy so it cannot be counted twice
            remaining.remove(x)
    return c

print(duplicate_finder(a, b))  # [3, 5, 5]

score 0 · Answer 7 · answered Jul 21 '21 at 11:38

Try this. You can use the any() built-in to check whether an element of b equals some element of a, then remove that element from a:

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
l3=[]
for i in b:
    if any(i == j for j in a):
        l3.append(i)
        a.remove(i)
        
print(l3)

Note that this doesn't scale well when the lists are large. It's O(len(a)*len(b)) while the optimal solution is O(len(a)+len(b)). – John Kugelman Jul 21 '21 at 11:40
Also, note that if there is no copy of `a`, the lists at the beginning and the end are not the same. – Metapod Jul 21 '21 at 11:48

score -3 · Answer 8 · answered Jul 21 '21 at 11:34

Although set.intersection removes duplicates, it can be very useful nonetheless:

a_set = set(a)
b_set = set(b)

intr = a_set.intersection(b_set)

result = [element for element in a if element in intr]

That should work

Dr. Prof. Patrick

This doesn't count them properly. If you swap `a` and `b` it will return `5` three times, not two. – John Kugelman Jul 21 '21 at 11:35
That corresponds to the example he attached. – Dr. Prof. Patrick Jul 21 '21 at 11:35
@JohnKugelman OK, I didn't understand the problem. – Dr. Prof. Patrick Jul 21 '21 at 11:37
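
To make the miscount concrete, here is a small sketch that wraps the filtering approach from this answer in a hypothetical helper, `filter_by_intersection`, and runs it in both argument orders. It keeps every occurrence from the first list whose value also appears in the second, so the counts follow the first list rather than the minimum across both.

```python
def filter_by_intersection(a, b):
    # Values present in both lists, with duplicates collapsed
    intr = set(a) & set(b)
    # Keeps ALL occurrences from a, regardless of how many b has
    return [element for element in a if element in intr]

a = [1, 3, 3, 4, 5, 5]
b = [3, 5, 5, 5, 6]
print(filter_by_intersection(a, b))  # [3, 3, 5, 5]
print(filter_by_intersection(b, a))  # [3, 5, 5, 5]
```

Neither call gives the expected `[3, 5, 5]`: the first keeps two 3s although `b` has only one, and the second keeps three 5s although `a` has only two.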
