As already mentioned by @Laurent and @sisanared, you can use the in
operator for either lists
or sets
to check for membership. For example:
found = x in some_list
if found:
#do stuff
else:
#other stuff
However, you mentioned that speed is an issue. TL;DR -- sets
are faster if the set
already exists. From https://wiki.python.org/moin/TimeComplexity, checking membership using the in
operator is O(n) for list
and O(1) for set
(like @enderland pointed out).
For 100,000 items, or for one-time-only checks it probably doesn't make much of a difference which you use, but for a larger number of items or situations where you'll be doing many checks, you should probably use a set
. I did a couple of tests from the interpreter and this is what I found (Python 2.7, i3 Windows 10 64bit):
import timeit
#Case 1: Timing includes building the list/set
def build_and_check_a_list(n):
a_list = [ '/'.join( ('http:stackoverflow.com',str(i)) ) for i in xrange(1,n+1) ]
check = '/'.join( ('http:stackoverflow.com',str(n)) )
found = check in a_list
return (a_list, found)
def build_and_check_a_set(n):
a_set = set( [ '/'.join( ('http:stackoverflow.com',str(i)) ) for i in xrange(1,n+1) ] )
check = '/'.join( ('http:stackoverflow.com',str(n)) )
found = check in a_set
return (a_set, found)
timeit.timeit('a_list, found = build_and_check_a_list(100000)', 'from __main__ import build_and_check_a_list', number=50)
3.211972302022332
timeit.timeit('a_set, found = build_and_check_a_set(100000)', 'from __main__ import build_and_check_a_set', number=50)
4.5497120006930345
#Case 2: The list/set already exists (timing excludes list/set creation)
check = '/'.join( ('http:stackoverflow.com',str(100000)) )
timeit.timeit('found = check in a_list', 'from __main__ import a_list, check', number=50)
0.12173540635194513
timeit.timeit('found = check in a_set', 'from __main__ import a_set, check', number=50)
1.01052391983103e-05
For 1 million entries, to build and/or check membership on my computer:
#Case 1: list/set creation included
timeit.timeit('a_list, found = build_and_check_a_list(1000000)', 'from __main__ import build_and_check_a_list', number=50)
35.71641090788398
timeit.timeit('a_set, found = build_and_check_a_set(1000000)', 'from __main__ import build_and_check_a_set', number=50)
51.41244436103625
#Case 2: list/set already exists
check = '/'.join( ('http:stackoverflow.com',str(1000000)) )
timeit.timeit('found = check in a_list', 'from __main__ import a_list, check', number=50)
1.3113457772124093
timeit.timeit('found = check in a_set', 'from __main__ import a_set, check', number=50)
8.180430086213164e-06