Not all of my solution methods provided below will be necessarily efficient. My goal is to demonstrate every possible solution method I can think of - at the end of my answer I provide "benchmark" results to show why or why not you should use one certain method over another. I believe that is a good way of learning, and I will shamelessly encourage such learning in my answers.
Subset + hash set
s
>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> set([l[0] for l in a_list])
{1, 2}
>>>
>>> 1 in set([l[0] for l in a_list])
True
map()
, and anonymous functions
>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> map(lambda x: x[0] == 1, a_list)
[True, True, False]
>>>
>>> True in set(map(lambda x: x[0] == 1, a_list))
True
filter
and anonymous functions
>>> a_list = [(1,5), (1,7), (2,3)]
>>>
>>> filter(lambda x: x[0] == 1, a_list)
[(1,5), (1,7)]
>>>
>>> len(filter(lambda x: x[0] == 1, a_list)) > 0 # non-empty list
True
MICROBENCHMARKS
Conditions
- 1000 items
- 100K repetition
- 0-100 random range
- Python 2.7.10, IPython 2.3.0
Script
from pprint import pprint
from random import randint
from timeit import timeit
N_ITEMS = 1000
N_SIM = 1 * (10 ** 5) # 100K = 100000
a_list = [(randint(0, 100), randint(0, 100)) for _ in range(N_ITEMS)]
set_membership_list_comprehension_time = timeit(
"1 in set([l[0] for l in a_list])",
number = N_SIM,
setup="from __main__ import a_list"
)
bool_membership_map_time = timeit(
"True in set(map(lambda x: x[0] == 1, a_list))",
number = N_SIM,
setup="from __main__ import a_list"
)
nonzero_length_filter_time = timeit(
"len(filter(lambda x: x[0] == 1, a_list)) > 0",
number = N_SIM,
setup="from __main__ import a_list"
)
any_list_comprehension_time = timeit(
"any(t[0] == 1 for t in a_list)",
number = N_SIM,
setup="from __main__ import a_list"
)
results = {
"any(t[0] == 1 for t in a_list)": any_list_comprehension_time,
"len(filter(lambda x: x[0] == 1, a_list)) > 0": nonzero_length_filter_time,
"True in set(map(lambda x: x[0] == 1, a_list))": bool_membership_map_time,
"1 in set([l[0] for l in a_list])": set_membership_list_comprehension_time
}
pprint(
sorted(results.items(), key = lambda x: x[1])
)
Results (in seconds)
[('any(t[0] == 1 for t in a_list)', 2.6685791015625), # winner - Martijn
('1 in set([l[0] for l in a_list])', 4.85234808921814),
('len(filter(lambda x: x[0] == 1, a_list)) > 0', 7.11224889755249),
('True in set(map(lambda x: x[0] == 1, a_list))', 10.343087911605835)]
Who's got the last laugh now? ... Martijn (at least I tried)
MORAL OF THE STORY: Don't spend more than 10 minutes "proving" your inferior solution is faster and more efficient on a small test data, when another user's answer is the de-facto correct one