0

The goal is to find common elements in two lists while preserving duplicates.

For example,

Input:

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

Expected output:

[3,5,5]

I tried set.intersection but set operatons would eliminate duplicates.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Leo
  • 65
  • 2
  • 10

8 Answers8

3

Here is my suggestion:

from collections import Counter
ac=Counter(a)
bc=Counter(b)

res=[]

for i in set(a).intersection(set(b)):
    res.extend([i] * min(bc[i], ac[i]))

>>> print(res)
[3, 5, 5]
IoaTzimas
  • 10,538
  • 2
  • 13
  • 30
  • By my count this performs four passes over the input: one for the two `Counter`s, one for the two `set`s, one to build the `intersection()`, and one for the loop over the intersection. Can some of them be combined, you think? – John Kugelman Jul 21 '21 at 11:44
  • @JohnKugelman Sure they can, but sometimes I prefer clearness and simplicity, especially if no issues regarding efficiency are mentioned. It can be done iven in one liner but it will be very ugly and unreadable – IoaTzimas Jul 21 '21 at 12:06
2
a = [1,3,3,4,5,5]
b = [3,5,5,5,6]


def findout(a, b):
    a = a.copy()
    output = []
    for i in b:
        if i in a:
            a.remove(i)
            output.append(i)
    return output


result = findout(a, b)
print(result) # [3, 5, 5]

may work.

Darcy
  • 160
  • 1
  • 9
  • 1
    Note that this doesn't scale well when the lists are large. It's O(len(a)*len(b)) while the optimal solution is O(len(a)+len(b)). – John Kugelman Jul 21 '21 at 11:38
2

You can use a Counter of your lists and use those keys that occure in both and the minimal amount of their values:

from collections import Counter

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

ca = Counter(a)
cb = Counter(b)


result = [a for b in ([key] * min(ca[key], cb[key])
                      for key in ca
                      if key in cb) for a in b]
print(result)

Output:

[3,5,5]
Patrick Artner
  • 50,409
  • 9
  • 43
  • 69
2

Using Counter from collections module.

from collections import Counter

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]

ans = []
a_count = Counter(a)
b_count = Counter(b)


for i in a_count:
    if i in b_count:
        ans.extend([i]*min(a_count[i], b_count[i]))

print(ans)

Output

[3, 5, 5]
John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Ram
  • 4,724
  • 2
  • 14
  • 22
1

The answer depends if the lists are always sorted like in your example. If so, you can do a cursor approach where

index_a = 0
index_b = 0
common_elements = []

while index_a < len(a) and index_b < len(b):
    if a[index_a] < b[index_b]:
        # then a should check the next number, b should stay
        index_a += 1
    elif a[index_a] > b[index_b]:
        # then the reverse
        index_b += 1
    else:
        # they are equal
        common_elements.append(a[index_a])
        index_a += 1
        index_b += 1

However, if they are not sorted like that you're better off maybe doing the set intersection and then turning it back into a list and then for each element add duplicates to equal min(a.count(el), b.count(el))?

1

That preserving duplicates got my head but finally got a solution

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
c=[]
def dublicate_finder(a,b):
    global c
    if len(a)>len(b):
        for i in range(len(b)):
            if b[i] in a:
                c.append(b[i])
                remove_index=a.index(b[i],0,len(a))
                del a[remove_index]
    if len(a)>len(b):
        for i in range(len(a)):
            if a[i] in b:
                c.append(a[i])
                remove_index=b.index(a[i],0,len(b))
                del a[remove_index]
    return c
Faraaz Kurawle
  • 1,085
  • 6
  • 24
0

Try this. You can use the any operator to check if the element is equal to that in other list.

Then remove the element

a = [1,3,3,4,5,5]
b = [3,5,5,5,6]
l3=[]
for  i in b:
    if any(i==j for j in a):
        l3.append(i)
        a.remove(i)
        
print(l3)
  • Note that this doesn't scale well when the lists are large. It's O(len(a)*len(b)) while the optimal solution is O(len(a)+len(b)). – John Kugelman Jul 21 '21 at 11:40
  • Also, note that if there is no copy of `a`, the lists at the beginning and the end are not the same. – Metapod Jul 21 '21 at 11:48
-3

Although set.intersection removes duplicates, it can be very useful nonetheless:

a_set = set(a)
b_set = set(b)

intr = a_set.intersection(set_b)

result = [element for element in a if element in intr]

That should work

Dr. Prof. Patrick
  • 1,280
  • 2
  • 15
  • 27