Just use any:
if any(myvar in x for x in (r,s,t))
Set lookups are O(1), so creating a union just to check whether the variable is in any of the sets is totally unnecessary. Simply check each set using in with any, which will short circuit as soon as a match is found and does not create a new set.
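As a minimal sketch of the any approach, the check can be wrapped in a small helper. The name in_any and the sample sets here are only illustrative assumptions, not something from the question:

```python
def in_any(value, *sets):
    """Return True if value is in at least one of the given sets."""
    # any() stops consuming the generator at the first True result,
    # so later sets are never checked once a match is found.
    return any(value in s for s in sets)


r = {1, 2, 3}
s = {10, 20, 30}
t = {100, 200, 300}

print(in_any(20, r, s, t))  # found in s, so t is never checked
print(in_any(99, r, s, t))  # not in any of the sets
```

Each membership test is an average-case constant-time hash lookup, so the cost depends on the number of sets, not on how many elements they hold.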
You also ask: "And I am also wondering if this union will affect somehow performance"
Yes, of course unioning the sets affects performance. It adds to the complexity: you are creating a new set every time, which is O(len(r)+len(s)+len(t)), so you can say goodbye to the real point of using sets, which is efficient lookups.
So the bottom line is that if you want to keep efficient lookups with a union, you will have to union the sets once and keep the result in memory in a new variable, then use that to do your lookup for myvar. The initial creation will be O(n) and lookups will be O(1) thereafter.
If you don't, and instead create the union with set.union(*(r, s, t)) every time you want to do a lookup, you will have a solution that is linear in the length of r+s+t, as opposed to at worst three constant (on average) lookups. Keeping a single union also means always adding or removing any elements from the new unioned set that are added to or removed from r, s or t.
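If you do go the cached-union route, the bookkeeping could look roughly like the sketch below. The names combined, add_to and remove_from are hypothetical helpers invented for illustration. The sketch also assumes that an element removed from one set should stay in the union while another set still contains it, which is a detail the text above does not spell out:

```python
r = {1, 2, 3}
s = {3, 4, 5}
t = {6, 7}
sources = (r, s, t)

# Built once: O(len(r) + len(s) + len(t)).
combined = set.union(*sources)


def add_to(target, value):
    """Add value to one source set and mirror it in the cached union."""
    target.add(value)
    combined.add(value)


def remove_from(target, value):
    """Remove value from one source set and update the cached union."""
    target.discard(value)
    # Only drop it from the union if no other source set still holds it.
    if not any(value in src for src in sources):
        combined.discard(value)


add_to(t, 8)
remove_from(r, 3)  # 3 is still in s, so it stays in combined
remove_from(r, 1)  # 1 is in no other set, so it leaves combined

print(8 in combined, 3 in combined, 1 in combined)
```

Every mutation now has to go through these helpers, which is exactly the extra maintenance cost that the plain any check avoids.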
Some realistic timings on moderately large sets show the difference:
In [1]: r = set(range(10000))
In [2]: s = set(range(10001,20000))
In [3]: t = set(range(20001,30000))
In [4]: timeit any(29000 in st for st in (r,s,t))
1000000 loops, best of 3: 869 ns per loop
In [5]: timeit 29000 in r | s | t
1000 loops, best of 3: 956 µs per loop
In [6]: timeit 29000 in reduce(lambda x,y :x.union(y),[r,s,t])
1000 loops, best of 3: 961 µs per loop
In [7]: timeit 29000 in r.union(s).union(t)
1000 loops, best of 3: 953 µs per loop
Timing the union shows that pretty much all the time is spent in the union calls:
In [8]: timeit r.union(s).union(t)
1000 loops, best of 3: 952 µs per loop
Using larger sets and getting the element in the last set:
In [15]: r = set(range(1000000))
In [16]: s = set(range(1000001,2000000))
In [17]: t = set(range(2000001,3000000))
In [18]: timeit any(2999999 in st for st in (r,s,t))
1000000 loops, best of 3: 878 ns per loop
In [19]: timeit 2999999 in reduce(lambda x,y :x.union(y),[r,s,t])
1 loops, best of 3: 161 ms per loop
In [20]: timeit 2999999 in r | s | t
10 loops, best of 3: 157 ms per loop
There is literally no difference no matter how large the sets get using any, but as the set sizes grow so does the running time using union.
The only way to make it faster would be to stick to or, but we are talking about a difference of a few hundred nanoseconds, which is the cost of creating the generator expression and the function call:
In [22]: timeit 2999999 in r or 2999999 in s or 2999999 in t
10000000 loops, best of 3: 152 ns per loop
If you do need to union the sets, set.union(*(r, s, t)) is also the fastest way, as you don't build intermediary sets:
In [47]: timeit 2999999 in set.union(*(r,s,t))
10 loops, best of 3: 108 ms per loop
In [49]: r | s | t == set.union(*(r,s,t))
Out[49]: True
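The comparisons above can also be reproduced outside IPython with the standard timeit module. This is only a rough sketch: the candidates dict and the repeat counts are arbitrary choices made here, each lambda adds a small call overhead of its own, and the absolute numbers will differ by machine and Python version:

```python
import timeit

r = set(range(10000))
s = set(range(10001, 20000))
t = set(range(20001, 30000))
target = 29000

# Each candidate answers the same question: is target in r, s or t?
candidates = {
    "any": lambda: any(target in st for st in (r, s, t)),
    "or": lambda: target in r or target in s or target in t,
    "r | s | t": lambda: target in r | s | t,
    "set.union": lambda: target in set.union(r, s, t),
}

for name, func in candidates.items():
    # Best of 3 repeats of 100 calls each, reported per call.
    best = min(timeit.repeat(func, number=100, repeat=3)) / 100
    print(f"{name:>10}: {best * 1e6:.2f} us per call")
```

The two union variants should come out orders of magnitude slower than any and or, in line with the timings shown above.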