I've been told that Python sets are faster than lists when it comes to membership testing.
Despite that, timeit shows that for a large number of values, lists are actually faster.
For a smaller set with more repetitions, the difference shrinks and even reverses, but sets still show no significant advantage (and I guess performance matters most for very large data sets, doesn't it?).
How can these results be explained?
>>> import timeit
>>> # Few repetitions on a bigger set:
>>> timeit.timeit('10000 in set(range(10000000))', number=10)
9.265543753999737
>>> timeit.timeit('10000 in list(range(10000000))', number=10)
4.788996731000225
>>> # More repetitions on a smaller set:
>>> timeit.timeit('10000 in set(range(10000))', number=100000)
32.068307194000226
>>> timeit.timeit('10000 in list(range(10000))', number=100000)
32.45919990500079