0

I have a large list (~200,000 elements) and another list of size 1-9. I need to find the longest arrangement of the small list in the large list. So I need to find whether the small list is a subset of the large list, with no repetitions (i.e. is [a,a] a subset of [a,b,c] should return False). Is there a way to do this?

I have tried .issubset but this returns true for duplicates. I have also tried checking through each permutation of the small list but this is too slow.

benvc
  • 14,448
  • 4
  • 33
  • 54
Ollie
  • 105
  • 1
  • 1
  • 9

1 Answers1

1

It seems that you need a multiset (as suggested by your use of set), Python provides it's own multiset implementation through collections.Counter, so use it:

from collections import Counter

short = ['a','a']
large =  ['a','b','c']

short_counts = Counter(short)
large_counts = Counter(large)

print(all(value <= large_counts.get(key, 0) for key, value in short_counts.items()))

Output

False

Putting all in a function and adding some examples we get:

def sub_multi_set(sh, la):
    short_counts = Counter(sh)
    large_counts = Counter(la)
    return all(value <= large_counts.get(key, 0) for key, value in short_counts.items())


print(sub_multi_set(['a', 'a'], ['a', 'b', 'c']))
print(sub_multi_set(['a'], ['a', 'b', 'c']))
print(sub_multi_set(['a', 'a'], ['a', 'a', 'a', 'b', 'c']))
print(sub_multi_set(['a', 'b'], ['a', 'b', 'c']))
print(sub_multi_set(['d', 'b'], ['a', 'b', 'c']))

Output

False
True
True
True
False
Dani Mesejo
  • 61,499
  • 6
  • 49
  • 76