
Please don't downvote. I am looking for a better way to solve this problem, because what I am doing now is becoming a huge burden.

Here is my question: I have two lists, and I want to make sure that no item from one list is in the other list.

In Python I have been working with the following lines... (Assuming List_s has 3 items.)

# no item from List_s should be in List_big
if List_s[0] not in List_big and List_s[1] not in List_big and List_s[2] not in List_big:
    pass  # do something
else:
    pass

These lines worked fine for me until I realized that I have to work with lists of more than 200 items, and I have a lot of lists to compare.

So how could I change the code? Thank you very much for your help!

doglas
    `any(x in List_big for x in List_s)` tells you whether the lists share an item, but it takes quadratic time, because every `in` scans `List_big`. It is probably better to make a `set` out of `List_big` so the check runs in linear time. – juanpa.arrivillaga Feb 22 '17 at 03:29

3 Answers


You can convert one of the lists to a set and use set.intersection:

if not set(List_s).intersection(List_big):
    print('no common items between lists')

Note that the elements in both lists must be hashable.
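A minimal sketch of how this could replace the chained `not in` checks for lists of any length. The helper name `have_no_common_items` and the sample data are made up for illustration; only `set` and `set.intersection` come from the answer above.

```python
def have_no_common_items(list_s, list_big):
    """Return True when no item of list_s appears in list_big."""
    # Only list_s is turned into a set; intersection() accepts any iterable.
    return not set(list_s).intersection(list_big)


# Hypothetical sample data.
List_big = list(range(1000))
print(have_no_common_items([1000, 1001, 1002], List_big))  # True
print(have_no_common_items([5, 1001, 1002], List_big))     # False
```

Because the helper takes the two lists as arguments, it can be called in a loop when there are many lists to compare.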

krock

You will get the same results whether you use:

set(List_s).isdisjoint(List_big)

or:

not set(List_s).intersection(List_big)

But set.isdisjoint is much faster, since it can stop at the first common item instead of building the full intersection. On my computer, isdisjoint takes about 150 nanoseconds to run against test data, while intersection takes 95 microseconds to run against the same data, which is about 630 times slower.
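The comparison can be reproduced with a rough sketch like the one below. The test data here is hypothetical (it is not the data used for the numbers above), and it is chosen so that a common item turns up immediately; timings will differ from machine to machine, so none are shown.

```python
import timeit

# Hypothetical test data: the very first item of list_big is also in list_s.
list_s = list(range(200))
list_big = list(range(100_000))
set_s = set(list_s)  # built once, so only the comparison itself is timed

# isdisjoint() can return as soon as it sees one common item.
t_disjoint = timeit.timeit(lambda: set_s.isdisjoint(list_big), number=20)
# intersection() walks all of list_big and builds the result set.
t_intersection = timeit.timeit(lambda: set_s.intersection(list_big), number=20)

# Both approaches must agree on the answer.
same_answer = set_s.isdisjoint(list_big) == (not set_s.intersection(list_big))
print(same_answer)
print(f"isdisjoint:   {t_disjoint:.6f} s")
print(f"intersection: {t_intersection:.6f} s")
```

When the lists really have nothing in common, both methods have to look at every item, so the gap should be much smaller than in this early-match case.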

Tom Lynch

For very large lists:

import random
import timeit

# Two lists of one million random integers each (Python 3 syntax).
# The commented results are the timings originally reported for this answer.
L = [random.randrange(2000000) for _ in range(1000000)]
M = [random.randrange(2000000) for _ in range(1000000)]

start_time = timeit.default_timer()
print(any(x in M for x in L))
#True
print(timeit.default_timer() - start_time)
#0.00981207940825

start_time = timeit.default_timer()
print(not set(L).isdisjoint(M))
#True
print(timeit.default_timer() - start_time)
#0.164795298542

start_time = timeit.default_timer()
print(bool(set(L) & set(M)))
#True
print(timeit.default_timer() - start_time)
#0.436377859225

start_time = timeit.default_timer()
print(bool(set(L).intersection(M)))
#True
print(timeit.default_timer() - start_time)
#0.368563831022

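For this data,

any(x in M for x in L)

is the fastest, but only because L and M share many items: any() stops at the first match and never has to build a set. If the lists have few or no items in common, it has to scan M once for every item in L, which takes quadratic time, and the set-based versions will be faster.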
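The result depends heavily on the data, so here is a small sketch of the opposite situation, where the two lists share no item at all. The list sizes and the helper names `with_any` and `with_set` are assumptions made for illustration; timings are printed rather than quoted because they vary by machine.

```python
import timeit

# Hypothetical worst case: the two lists share no item at all.
L = list(range(0, 2000))
M = list(range(2000, 4000))


def with_any(a, b):
    # Each `x in b` scans the list b, so this is up to len(a) * len(b) comparisons.
    return any(x in b for x in a)


def with_set(a, b):
    # Building the set is linear; each membership test is then O(1) on average.
    return not set(a).isdisjoint(b)


print(with_any(L, M), with_set(L, M))  # False False
print("any on lists:  ", timeit.timeit(lambda: with_any(L, M), number=3))
print("set.isdisjoint:", timeit.timeit(lambda: with_set(L, M), number=3))
```

With no early match to stop on, `any` over two plain lists has to do every comparison, which is the case the question is worried about.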

Keerthana Prabhakaran