My question arises from answers posted to How to find the missing numbers in an arbitrary list in python 3?.
Most solutions suggest using something akin to
a = [10,12,13,8]
# get set of full numbers
allNums = set( (x for x in range(min(a),max(a)+1)))
# do some kind of set operation / symetric difference
This needs 2 iterations of a
to fetch min(a)
and max(a)
as values from the list to build the range that entails all numbers between min(a)
and max(a)
.
It is easy to simplify this to just one pass of a
:
def minmax(data):
"""Get the min and max of an iterable in O(n) time and constant space."""
minValue = data[0]
maxValue = data[0]
for d in data[1:]:
minValue = d if d < minValue else minValue
maxValue = d if d > maxValue else maxValue
return (minValue,maxValue)
that retrieves both in O(n) time and constant space.
Is there some way to do this with built-ins / modules in python?
Edit: Agreed: min() and max() are both O(n) as well - but used twice (which is constant and reduced to O(n) - yep) - but it is still slower to do it twice then once.
Edit with some benchmarkings:
import timeit
# 100k random numbers to min/max upon
data = """import random
random.seed(42)
data = random.choices(range(1000000),k=100000)"""
t1 = timeit.timeit("""
mi,ma=minmax(data)
""",setup="""
import functools
def minmax(aa):
return functools.reduce(lambda mm,xx : ( min(mm[0],xx),max(mm[1],xx)) , aa, ( aa[0],aa[0],))
""" + data, number = 1000 )
Simple min / max usage:
t2 = timeit.timeit("""
mi,ma=min(data),max(data)
""",setup=data, number = 1000)
One pass try with if/elif to cut on comparisons:
t3 = timeit.timeit("""
mi,ma=minmax(data)
""",setup="""
def minmax(data):
minValue = data[0]
maxValue = data[0]
for d in data[1:]:
if d < minValue: # changed to if / elif: in a vain attempt to make it faster
minValue = d # its closer to the proposed solution in the numpy-question
elif d > maxValue: # linked above
maxValue = d
return (minValue,maxValue)
""" + data, number = 1000)
One pass try without if/elif (more comparisons needed):
t4 = timeit.timeit("""
mi,ma=minmax(data)
""",setup="""
def minmax(data):
minValue = data[0]
maxValue = data[0]
for d in data[1:]:
minValue = d if d < minValue else minValue
maxValue = d if d > maxValue else maxValue
return (minValue,maxValue)
""" + data, number = 1000)
Which lead to:
minmanx-reduce: 148.5929143627707
default min + max: 3.376458476185718 # ouch .. seems we just use these
minmax1passOptimized: 15.975109436292087
minmax1pass: 20.29275910515082