3

One of the PyCon 2011 talks shared this any() trick; the explanation given was that the loop runs in C.

Could someone explain it in more detail? What is the trick behind it, and are there any other use cases?

>>> import itertools, hashlib, time
>>> _md5 = hashlib.md5()
>>> def run():
...   for i in itertools.repeat('foo', 10000000):
...     _md5.update(i)
... 
>>> a = time.time(); run(); time.time() -a
3.9815599918365479
>>> _md5 = hashlib.md5()
>>> def run():
...   any(itertools.imap(_md5.update, itertools.repeat('foo', 10000000)))
... 
>>> a = time.time(); run(); time.time() -a
2.1475138664245605
>>> 
unutbu
alexband

2 Answers

4

itertools.imap creates a lazy iterator that pairs the function to be called (_md5.update) with its arguments (the 'foo' strings). The calls are not evaluated at this point; they are only prepared along with their arguments (I think these are called thunks). When you then pass this iterator to any(), it steps through all the elements, evaluating each call. any() normally stops at the first true value, but _md5.update returns None, so it runs all the way to the end. This is faster than the explicit Python loop in the first program because any() and imap are implemented in C, so the looping happens in C code without returning to the bytecode interpreter after each element.
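As a rough sketch of the same idea in Python 3 (where the built-in map is already lazy and replaces itertools.imap, and hashlib needs bytes rather than str), the two variants might look like this. The function names hash_with_loop, hash_with_any and record are made up for illustration; no timings are claimed here, since they depend on the machine and interpreter.

```python
import hashlib
from itertools import repeat


def hash_with_loop(chunk, count):
    """Feed `chunk` to MD5 `count` times using an explicit Python loop."""
    md5 = hashlib.md5()
    for piece in repeat(chunk, count):
        md5.update(piece)
    return md5.hexdigest()


def hash_with_any(chunk, count):
    """Same work, but the iteration is driven by any() and map() in C."""
    md5 = hashlib.md5()
    # md5.update returns None (falsy), so any() never short-circuits
    # and consumes the whole iterator.
    any(map(md5.update, repeat(chunk, count)))
    return md5.hexdigest()


# Caveat: with a function that can return a true value, any() stops early.
seen = []


def record(value):
    """Hypothetical helper: remember the value and return it unchanged."""
    seen.append(value)
    return value


any(map(record, [0, 0, 1, 2]))  # stops at the first truthy result (1)
```

Both functions produce the same digest; the second simply avoids executing loop bytecode for every element. The record example shows why the trick is only safe when the mapped function always returns something falsy.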

unutbu
MK.
0

There's no "trick" per se; running compiled C code is faster than running Python bytecode.
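As for other use cases, any C-implemented consumer can drive the iteration in the same way. One variant, sketched here in Python 3 and not taken from the talk, feeds the iterator to a zero-length collections.deque, which discards every item without depending on what the mapped function returns. The helper name consume is an assumption borrowed from the style of the itertools recipes.

```python
import collections
import hashlib
from itertools import repeat


def consume(iterator):
    """Exhaust `iterator` at C speed, discarding every item."""
    # A deque with maxlen=0 keeps nothing, so this just runs the iterator dry.
    collections.deque(iterator, maxlen=0)


md5 = hashlib.md5()
consume(map(md5.update, repeat(b"foo", 1000)))
digest = md5.hexdigest()
```

Unlike any(), this does not stop early if the mapped function happens to return a true value.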

Ignacio Vazquez-Abrams