0

I was recently working on this question. Essentially, my answer involved a call to sorted with a lambda as the key:

sorted(range(len(L)), key=lambda i : L[i])

Given that performance was at the core of the question, and that lambdas are inherently slow, I could have optimized a bit by defining a function and using it in place of the lambda.

Still, I feel that that I'd be reinventing the wheel. There has to be a built-in function somewhere or in some importable module that provides the functionality of __getitem__ (which, the only reason I don't want to use, is that it's not really pythonic to use mangled methods).

I know about operator.getitem which let's me predefine an index i and get the element at i in any input sequence. But is there a function (say foo) that works as follows:

In [14]: g = operator.itemgetter(1)

In [15]: d = {'a':1, 'b':2, 'c':3, 'd':4}

In [16]: for i in d.iteritems():
   ....:     print g(i),
   ....:     
1 3 2 4

In [17]: L = list('abcd')

In [18]: g = foo(L)

In [19]: for i in range(4):
   ....:     print g(i),
   ....:     
'a' 'b' 'c' 'd'

Sorry if this is a duplicate question, but the search words that I could think of did not yield results.

Community
  • 1
  • 1
inspectorG4dget
  • 110,290
  • 27
  • 149
  • 241
  • And what is your expected behaviour when values are not unique? – wim Oct 09 '12 at 04:14
  • 3
    Lambdas are just functions like any other Python function. They are not slower or faster than a "long-form" function (with a `def` statement) that does the same thing. – BrenBarn Oct 09 '12 at 04:15
  • @wim: expected behavior with non-unique values - return the same answer as last time, just like how `L[i]` does assuming that `L` hasn't changed between the two calls – inspectorG4dget Oct 09 '12 at 04:24
  • so it seems to be just like `__getitem__` unless i'm missing something? you want something similar to dict.get but for lists? – wim Oct 09 '12 at 05:14
  • 2
    It's not pythonic to use `__getitem__` in _the wrong places_. This isn't one of the wrong places. It's perfectly sensible to use those methods if you need to pass a method into a function. – John La Rooy Oct 09 '12 at 05:26
  • @wim: just like `__getitem__`. But also that can be bound to a single container. So `L.__getitem__`; only, without being mangled – inspectorG4dget Oct 09 '12 at 05:26
  • Names aren't mangled for special methods. Name mangling happens if the name starts with double underscore, but doesn't end in double underscore. `L.__getitem__` does exactly what you want – John La Rooy Oct 09 '12 at 05:38
  • @gnibbler: Oh! so I was super misinformed! Thank you. ++rep @ you if you post this as an answer – inspectorG4dget Oct 09 '12 at 05:46
  • @JBernardo, already posted (and deleted) it as an answer although without the clarification between magic/special and mangled methods. – John La Rooy Oct 09 '12 at 05:52
  • Could you use `functools.partial()` with `operator.getitem()` and a container to make your `foo()`? – martineau Oct 09 '12 at 05:54
  • @martineau: How would you accomplish that? Could you post an example please? – inspectorG4dget Oct 09 '12 at 05:59

1 Answers1

2

If I've understood what you want correctly, the following would do that:

import functools
import operator

L = list('abcd')

def foo(indexable):
    return functools.partial(operator.__getitem__, indexable)

g = foo(L)

for i in xrange(len(L)):
    print g(i),

Update:

I've experimented further and was surprised to discover a slightly faster solution, which is this nothing other than simply this:

def foo2(indexable):
    return indexable.__getitem__

Which, when run using a little testbed I threw together, produced the following results:

fastest to slowest *_test() function timings:
 10,000 elements, 1,000 timeit calls, best of 3

  foo2_test() : 1.46 (0.00 times slower)
lambda_test() : 4.15 (1.84 times slower)
   foo_test() : 4.28 (1.93 times slower)

Each test function used just access each element of a list in a tight loop using a different technique.

Curious about how this applied to your sorting answer to the linked question, I obtained these differing results using it for sorting a list rather than just accessing each of the list's elements once:

fastest to slowest *_test() function timings:
 10,000 elements, 1,000 timeit calls, best of 3

  foo2_test() : 13.03 (0.00 times slower)
   foo_test() : 14.70 (0.13 times slower)
lambda_test() : 16.25 (0.25 times slower)

While foo2() was the fastest In both cases, in the sorting version it was only so by a very small amount.

Here's a listing of the full testbed used to get the first set of results for simple access:

import functools
import operator

import timeit
import types

N = 1000
R = 3
SZ = 10000
SUFFIX = '_test'
SUFFIX_LEN = len(SUFFIX)

def setup():
    import random
    global a_list
    a_list = [random.randrange(100) for _ in xrange(SZ)]

def lambda_test():
    global a_list
    f = lambda i: a_list[i]
    for i in xrange(len(a_list)): f(i)

def foo(indexable):
    return functools.partial(operator.__getitem__, indexable)

def foo_test():
    global a_list
    g = foo(a_list)
    for i in xrange(len(a_list)): g(i)

def foo2(indexable):
    return indexable.__getitem__

def foo2_test():
    global a_list
    g = foo2(a_list)
    for i in xrange(len(a_list)): g(i)

# find all the functions named *SUFFIX in the global namespace
funcs = tuple(value for id,value in globals().items()
            if id.endswith(SUFFIX) and type(value) is types.FunctionType)

# run the timing tests and collect results
timings = [(f.func_name[:-SUFFIX_LEN],
            min(timeit.repeat(f, setup=setup, repeat=R, number=N))
           ) for f in funcs]
timings.sort(key=lambda x: x[1])  # sort by speed (ironic use of lambda?)
fastest = timings[0][1]  # time fastest one took to run
longest = max(len(t[0]) for t in timings) # len of longest func name (w/o suffix)

print 'fastest to slowest *_test() function timings:\n' \
      ' {:,d} elements, {:,d} timeit calls, best of {:d}\n'.format(SZ, N, R)

def times_slower(speed, fastest):
    return speed/fastest - 1.0

for i in timings:
    print "{0:>{width}}{suffix}() : {1:.2f} ({2:.2f} times slower)".format(
                i[0], i[1], times_slower(i[1], fastest), width=longest, suffix=SUFFIX)

And here's the portion that was different when testing sort usage:

def setup():
    import random
    global a_list
    a_list = [random.randrange(100) for _ in xrange(SZ)]

def lambda_test():
    global a_list
    sorted(range(len(a_list)), key=lambda i:a_list[i])

def foo(indexable):
    return functools.partial(operator.__getitem__, indexable)

def foo_test():
    global a_list
    sorted(range(len(a_list)), key=foo(a_list))

def foo2(indexable):
    return indexable.__getitem__

def foo2_test():
    global a_list
    sorted(range(len(a_list)), key=foo2(a_list))
martineau
  • 119,623
  • 25
  • 170
  • 301