I have an application where I have a list of objects, and I need to create a sub list of those objects for which some property is minimum. I naturally started writing for thing in list:
type code, and then wondered whether it would be worth getting my head around these map and filter functions I'd read about from time to time, but ignored to now.
The task appears to break down into two stages, finding the minimum, and then creating the sub list. Unless I sorted the list, in which case I could slice the result, if I could think of a quick way to find where to slice it.
This is my benchmark code
import random
import time
LEN = 1000000
class Thing():
def __init__(self, data):
self.data = data
def __repr__(self):
return 'thing{}'.format(self.data)
things = []
for n in range(LEN):
things.append(Thing(random.randrange(2,6)))
# minimum by looping
startminloop = time.time()
mindata = 99
for thing in things:
if thing.data<mindata:
mindata = thing.data
stopminloop = time.time()
# map for minimum
startmap = time.time()
mindata = min(map(lambda x: x.data, things))
stopmap = time.time()
# create list by looping
startlistloop = time.time()
outlist = []
for thing in things:
if thing.data==mindata:
outlist.append(thing)
stoplistloop = time.time()
# list comprehension
startcomp = time.time()
outlist = [x for x in things if x.data==mindata]
stopcomp = time.time()
# list by filter
startfilter = time.time()
outlist = list(filter(lambda x: x.data==mindata, things))
stopfilter = time.time()
# sort for minimum
startsort = time.time()
things.sort(key=lambda x: x.data)
stopsort = time.time()
and I got the following results
minimum finding
loop 0.07794857025146484
sort 0.22138357162475586
map 0.11729907989501953
list creation
loop 0.09957003593444824
comp 0.06798076629638672
filt 0.13345861434936523
I was surprised that map and filter were both significantly slower than the naive looping method. While they are both fewer lines of code, they don't look much clearer to my beginner's eye. I'm interested to see list comprehension beats both methods.
As sort was fairly slow to find the minimum, I didn't go on to think about how to locate the correct place to slice the sorted list for my output, whether binary or linear search, or whether index could be forced to do it.
Is this just the wrong place to use map and filter?
Am I using them correctly, or as well as I could?
Is there a faster way (in pure python anyway) of doing this task?