I need to find all values that are present in every sublist of a larger list (they are all IDs).

What I have tried is first collecting all the unique values across the sublists and then testing each value against every sublist, but this is extremely slow on a big list.

l1 = ["a", "b", "c", "d", "e", "f"]
l2 = ["b", "c", "e", "f", "g"]
l3 = ["b", "c", "d", "e", "f", "h"]
LL = [l1, l2, l3]
unique_ids = set(x for l in LL for x in l)

filter_id = []
lenList = len(LL)
for id in unique_ids:
    if sum(id in item for item in LL) == lenList:
        filter_id.append(id)

How could I speed up the search?

Serk

2 Answers


I need to find all values that are present in every sublist of a larger list (they are all IDs).

If we flatten those sublists into a single list, each value that is present in every sublist will appear there exactly len(LL) times (in this case, 3), as long as each sublist is deduplicated first.

This can be done in a single line using Counter:

from collections import Counter

result = [key for key, value in Counter(elem for sub_list in LL for elem in set(sub_list)).items() if value == len(LL)]

Explanation:

  • set(sub_list) - removes any duplicates within a sublist so they do not inflate the count
  • (elem for sub_list in LL for elem in set(sub_list)) - flattens the deduplicated sublists into a single iterable
  • Counter - builds a dict-like mapping from each element to the number of times it appeared in the iterable
  • .items() - yields the (key, value) pairs of that mapping
  • if value == len(LL) - keeps only the keys that are present in every sublist
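
The steps listed above can also be written out one at a time. The following is a minimal sketch using the sample data from the question; the intermediate names flattened and counts are illustrative and do not appear in the original one-liner.

```python
from collections import Counter

# Sample data from the question.
l1 = ["a", "b", "c", "d", "e", "f"]
l2 = ["b", "c", "e", "f", "g"]
l3 = ["b", "c", "d", "e", "f", "h"]
LL = [l1, l2, l3]

# Deduplicate each sublist, then flatten everything into one iterable.
# (flattened is an illustrative name, not part of the original answer.)
flattened = (elem for sub_list in LL for elem in set(sub_list))

# Count how many sublists contain each element.
counts = Counter(flattened)

# Keep the keys whose count equals the number of sublists.
result = [key for key, value in counts.items() if value == len(LL)]

print(sorted(result))  # ['b', 'c', 'e', 'f']
```

Each element is visited once per sublist it appears in, so the work grows with the total number of elements rather than with (unique values × sublists × sublist length), which is what the loop in the question does.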

Edit: the same expression, split across lines for readability:

result = [key 
          for key, value in Counter(elem 
                                    for sub_list in LL 
                                    for elem in set(sub_list)
                                   ).items() 
          if value == len(LL)]
h4z3
  • This is very clever! It's almost instant on a very large list of sublists! Thanks! – Serk Oct 11 '19 at 11:45

Create a set of the elements found in all lists:

from itertools import chain
{elem for elem in chain(l1, l2, l3) if elem in l1 and elem in l2 and elem in l3}
# {'f', 'c', 'e', 'b'} (set order may vary)
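
When the number of sublists is not known in advance, the same idea can probably be expressed without naming each list. The sketch below swaps in a different technique, the built-in set.intersection applied to every sublist in LL; it is not the code from this answer, and the name common is illustrative.

```python
# Sample data from the question.
l1 = ["a", "b", "c", "d", "e", "f"]
l2 = ["b", "c", "e", "f", "g"]
l3 = ["b", "c", "d", "e", "f", "h"]
LL = [l1, l2, l3]

# Convert every sublist to a set, then intersect them all at once.
# The guard avoids a TypeError when LL is empty, because
# set.intersection needs at least one set to be called on.
common = set.intersection(*map(set, LL)) if LL else set()

print(sorted(common))  # ['b', 'c', 'e', 'f']
```

Converting each sublist to a set once also avoids the repeated list membership tests (elem in l1) that scan the whole list every time.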
ipaleka
  • My list of sublists contains many lists, so unfortunately I can't write them out one by one in the code, but it's a nice solution otherwise! – Serk Oct 11 '19 at 11:40