-2

Suppose I have two sets:

a = {1,2,3,4}
b = {2,2,5,7,3,3}

When I take the intersection of these two sets, I also want the duplicates in my result:

c = a.intersection(b)
print(c)

Desired output:

{2,2,3,3}
  • Could you give an example of the output you want? – Arghya Saha Aug 19 '18 at 07:36
  • A `set` can't have duplicate items. After defining `b = {2,2,5,7,3,3}`, `b` will only contain `{2, 3, 5, 7}`. Better to use a `list`. – Mohit Solanki Aug 19 '18 at 07:37
  • What are the rules for how many times something appears in the output? Do you really want `2` appearing twice in the output, even though it only appears once in `a`? – Mark Dickinson Aug 19 '18 at 07:46
  • If the answers below and the linked question don't cover your question adequately, please let me know, and I'll re-open your question. – PM 2Ring Aug 19 '18 at 07:48
  • @PM2Ring: This doesn't appear to be quite the same as the marked duplicate; that would give an intersection of `[2, 3]`, not `[2, 2, 3, 3]`. I'm still trying to figure out what the rules are that the OP is using here; it's not the usual multiset intersection. – Mark Dickinson Aug 19 '18 at 07:48
  • @MarkDickinson If the OP clarifies, and they need something beyond the existing answers, I'm happy to re-open. – PM 2Ring Aug 19 '18 at 07:50
  • @PM2Ring the duplicate doesn't yield the same result. It keeps the smallest number of elements instead of the max number (I tried to change & to | but it doesn't do the same). But yes, the inputs & outputs aren't clear. – Jean-François Fabre Aug 19 '18 at 07:59
  • @Jean-FrançoisFabre Understood. But until the OP clarifies we don't need more answers from people guessing the actual requirements. ;) – PM 2Ring Aug 19 '18 at 08:11

3 Answers

3

First, if you define a set like this, you'll lose the duplicate elements at once. So define lists instead:

a = [1,2,3,4]
b = [2,2,5,7,3,3]

Now my suggestion: count the elements of each list using collections.Counter, intersect the keys and take the max count for each common element, then expand the counter:

import collections

c1 = collections.Counter(a)
c2 = collections.Counter(b)

c3 = collections.Counter({k:max(c1[k],c2[k]) for k in set(a).intersection(b)})

print(list(c3.elements()))

result:

[2, 2, 3, 3]
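The same steps can be wrapped in a small reusable function. This is only a sketch: the name `intersect_keep_max` is hypothetical, and the result is sorted purely to make the output order deterministic, which assumes the elements are hashable and mutually comparable.

```python
from collections import Counter

def intersect_keep_max(a, b):
    """Hypothetical helper: keep values present in both inputs, each repeated
    as often as in whichever input has more copies of it."""
    c1, c2 = Counter(a), Counter(b)
    common = c1.keys() & c2.keys()  # values that occur in both inputs
    merged = Counter({k: max(c1[k], c2[k]) for k in common})
    # Sorting is an added assumption, used only for a predictable order.
    return sorted(merged.elements())

result = intersect_keep_max([1, 2, 3, 4], [2, 2, 5, 7, 3, 3])
print(result)  # [2, 2, 3, 3]
```

Taking `max` is what makes `2` appear twice even though it occurs only once in `a`; whether that matches the intended rule depends on the clarification asked for in the comments.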
Jean-François Fabre
2

As pointed out by @MohitSolanki, sets are not the right data structure for your use case, because they cannot contain duplicates. After b = {2,2,5,7,3,3}, b contains only the elements 2, 3, 5 and 7, which is likely not what you want.

May I suggest using lists? A possible solution might look like this:

def intersection(x: list, y: list):
    s = set(y)
    return [v for v in x if v in s]

a = [1,2,3,4]
b = [2,2,5,7,3,3]
print(intersection(a, b))
print(intersection(b, a))


''' Output:
print(intersection(a, b)) -> [2, 3]
print(intersection(b, a)) -> [2, 2, 3, 3]
'''

Do you have any questions regarding my proposed solution? Please let me know whether this answer is helpful to you, and if it is not, what is missing. Cheers :-)

EDIT: Incorporated the feedback from user @Jean-FrançoisFabre to improve the lookup cost by introducing s = set(y) and testing v in s in the intersection function. A membership test on a set takes O(1) time on average, whereas v in y on a list may scan up to len(y) elements, so the function runs in roughly O(len(x) + len(y)) time instead of O(len(x) · len(y)). Thanks for pointing that out! This is particularly useful when the second argument is long or contains a lot of duplicates.
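To make the effect of that edit concrete, here is a minimal sketch comparing a list-based membership test with the set-based one. The function names `intersection_naive` and `intersection_with_set` are made up for illustration, and the set version assumes the elements are hashable. Both return the same result; only the lookup cost differs.

```python
def intersection_naive(x, y):
    # `v in y` scans the list y for every element of x.
    return [v for v in x if v in y]

def intersection_with_set(x, y):
    s = set(y)  # built once; membership tests are O(1) on average
    return [v for v in x if v in s]

a = [1, 2, 3, 4]
b = [2, 2, 5, 7, 3, 3]

# Both variants agree, in either argument order.
same_ab = intersection_naive(a, b) == intersection_with_set(a, b)
same_ba = intersection_naive(b, a) == intersection_with_set(b, a)
print(same_ab, same_ba)  # True True
```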

tafaust
0

A set cannot contain duplicates; that's sort of the point of sets.

So, to extend your code:

>>> a = {1,2,3,4}
>>> b = {2,2,5,7,3,3}
>>> a
{1, 2, 3, 4}
>>> b
{2, 3, 5, 7}

Note that the duplicates in b have been automatically removed by being part of a set.

So first you must re-define your variables a and b as lists:

a = [1,2,3,4]
b = [2,2,5,7,3,3]  

Then I think your question is answered by: Intersection of two lists including duplicates?
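For comparison, the usual multiset intersection can be sketched with `collections.Counter` and its `&` operator, which keeps the minimum count of each common element. This is an illustration of that general technique, not necessarily the exact code from the linked question, and the result is sorted here only to give a predictable order. As noted in the comments, it yields `[2, 3]` for these inputs rather than `[2, 2, 3, 3]`.

```python
from collections import Counter

a = [1, 2, 3, 4]
b = [2, 2, 5, 7, 3, 3]

# Counter's & operator keeps the minimum count of each common element.
common = Counter(a) & Counter(b)
multiset_result = sorted(common.elements())
print(multiset_result)  # [2, 3]
```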

FraggaMuffin