3

In an exception handler for a CSP-style process, I need to read and discard the entire contents of a channel so that other processes blocked on sending to it can complete. The interface presents a generator for receiving. Is there a faster way to consume and discard the entire contents of a generator than the following?

for _ in chan:
    pass
Matt Joiner
  • http://stackoverflow.com/questions/3209789/what-is-the-most-pythonic-way-to-have-a-generator-expression-executed – Josh Lee Feb 21 '12 at 03:01

3 Answers

6

There is a way that is slightly faster:

collections.deque(chan, maxlen=0)

Your code makes the intention much clearer, though, so you should measure if there is a discernible difference. I'd almost always prefer your code.
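
For reference, here is a minimal sketch of the deque approach wrapped in a helper. The name `drain` and the toy generator `fake_chan` are assumptions for illustration; they are not part of the channel interface from the question.

```python
import collections


def drain(iterable):
    """Consume an iterable completely, discarding every item."""
    # A deque with maxlen=0 stores nothing, so each item is dropped
    # as soon as it is produced.
    collections.deque(iterable, maxlen=0)


def fake_chan(n, log):
    # Hypothetical stand-in for the channel's receiving generator.
    for i in range(n):
        log.append(i)  # record that the item was actually produced
        yield i


log = []
chan = fake_chan(5, log)
drain(chan)
```

After `drain(chan)` returns, the generator is exhausted, so any further `next(chan)` raises `StopIteration`.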

(I'd never use _ as a variable name, though. It tends to confuse people, clashes with _ in the interactive shell and with the common gettext alias.)

Edit: Here are some simple timings:

In [1]: import collections

In [2]: a = range(100000)

In [3]: timeit reduce(lambda _, __: None, a)
100 loops, best of 3: 13.5 ms per loop

In [4]: timeit for dummy in a: pass
1000 loops, best of 3: 1.75 ms per loop

In [5]: timeit collections.deque(a, maxlen=0)
1000 loops, best of 3: 1.51 ms per loop
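
The session above is from Python 2. To repeat the comparison on Python 3 (where `reduce` lives in `functools` and `range` is lazy), something like the following sketch should work. The helper names are assumptions for illustration, and absolute numbers will vary by machine and interpreter version, so no output is shown.

```python
import collections
import timeit
from functools import reduce

N = 100_000


def consume_loop(iterable):
    for dummy in iterable:
        pass


def consume_deque(iterable):
    collections.deque(iterable, maxlen=0)


def consume_reduce(iterable):
    # The initial value keeps reduce() from raising on an empty iterable.
    reduce(lambda _, __: None, iterable, None)


def bench(func, repeat=3, number=10):
    # Best-of-`repeat` time per call, in seconds.
    timings = timeit.repeat(lambda: func(range(N)), repeat=repeat, number=number)
    return min(timings) / number


results = {f.__name__: bench(f) for f in (consume_reduce, consume_loop, consume_deque)}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```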
Sven Marnach
  • 7
    `_` is a common name for a throwaway variable, I thought. – David Z Feb 21 '12 at 03:00
  • @DavidZaslavsky: In some languages it has a special meaning (e.g. Go). It has become abundant on SO, even for Python, but it is a bad idea to use it in Python. There is no advantage to using such a name. Call it `dummy` and you will avoid any confusion. – Sven Marnach Feb 21 '12 at 03:04
  • I'm only talking about Python; obviously this wouldn't apply to a language where there is an inherent meaning to the underscore. Personally I find it much clearer that a variable is intended to be a dummy variable if it's named with an underscore. – David Z Feb 21 '12 at 03:16
  • 1
    @DavidZaslavsky: There is no strong convention to use it in Python. As far as I'm aware, it's not mentioned at all anywhere on `python.org`. And I've been asked literally dozens of times what this strange syntax means. Obviously nobody ever asked me why an unused variable is called `dummy` or `unused`. – Sven Marnach Feb 21 '12 at 03:22
  • Well, I can't cite a source for you, but I've been under the impression that it _is_ a fairly common convention. I mean, if you have a statistical sample of Python code showing otherwise, then certainly I'm wrong, but I know what I've heard. (This is a separate issue from code clarity.) – David Z Feb 21 '12 at 03:29
  • 5
    @DavidZaslavsky: It certainly is rather common. A few instances even appeared in Python's standard library. But there is a difference between "it's common" and "it's a convention". And there is certainly a difference between "it's common" and "it's a good idea". – Sven Marnach Feb 21 '12 at 03:35
  • Yeah, that's fair. It's not that important of a discussion, anyway (mostly irrelevant to the answer). – David Z Feb 21 '12 at 04:11
  • You could try `[x for x in a]` as well, but I don't think it would be faster. Worth a try though. – Lennart Regebro Feb 21 '12 at 16:18
  • Rather than "_" as an unused variable name I tend to call it "unused" or "unused_whateveritis". – gps Feb 21 '12 at 16:37
  • 1
    @LennartRegebro: It would be much, much faster to use `list(a)` instead of `[x for x in a]`. I timed `list(a)` together with the above options (it's very fast!), but I did not include the timings because it creates an unneeded list with all the results, which might need a lot of memory in some cases. I only included solutions that send everything to the Orcus immediately. – Sven Marnach Feb 21 '12 at 16:43
  • The deque option is actually hand-optimized in the C implementation of deque: https://github.com/python/cpython/blob/cffe0467ab7b164739693598826bd3860f01b11f/Modules/_collectionsmodule.c#L356 – Chronial Feb 24 '18 at 00:23
1

I've started using a deque that I can reuse if need be:

from collections import deque
do_all = deque(maxlen=0).extend

Then I can consume generator expressions using:

do_all(poly.draw() for poly in model.polys)
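
A self-contained sketch of the same idea follows. `Poly` and `polys` are hypothetical stand-ins for the model above, and `draw()` only records a call instead of drawing anything.

```python
from collections import deque

# Bound method of a zero-length deque: consumes any iterable, keeps nothing.
do_all = deque(maxlen=0).extend


class Poly:
    """Hypothetical stand-in for a drawable polygon."""

    def __init__(self, name, drawn):
        self.name = name
        self.drawn = drawn

    def draw(self):
        # Record the call so the effect of do_all() can be observed.
        self.drawn.append(self.name)


drawn = []
polys = [Poly(name, drawn) for name in ("triangle", "square", "hexagon")]

# Each draw() call is made as the generator expression is consumed.
do_all(poly.draw() for poly in polys)
```

Because `do_all` is just a bound method, it can be reused for any number of iterables.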
PaulMcG
  • But this isn't faster than `for poly in model.polys: poly.draw()`, nor is it more readable. Why do you use it? (This can be slightly faster to consume some iterable you already have, but explicitly constructing a generator just to consume it this way seems rather pointless to me.) – Sven Marnach Feb 21 '12 at 16:42
  • Is your comment based on an actual test, or gut feel? I've done some tests with this, and I get about 5% improvement, since do_all does the iterating in C, instead of iterating a Python variable `poly` (which must be guarded against any modifications in the body of the for loop). For most for loops it doesn't matter, but in my case, I am drawing *many*, many polys. (See my artwork at http://www.fractallography.com) – PaulMcG Feb 21 '12 at 17:27
  • It's based on actual tests, many of which I did quite some time ago. I just did the most basic ones again, see https://gist.github.com/1877613 – Sven Marnach Feb 21 '12 at 17:38
  • 2
    "do_all does the iterating in C" -- not if you pass in a generator expression. The generator expression creates a Python code object for the part that is supposed to be executed in every iteration. – Sven Marnach Feb 21 '12 at 17:43
  • I'll have to go back to look at the itertools module again, and see just what the heck Raymond Hettinger was talking about! Thanks for keeping me honest! – PaulMcG Feb 21 '12 at 17:46
0

You might try:

reduce(lambda _, __: None, chan, None)

But honestly, I don't think you're going to do much better than the plain loop. "Channel" suggests I/O, which is going to be the bottleneck anyway.
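
On Python 3, `reduce` has to be imported from `functools`, and without an initial value it raises `TypeError` on an empty iterable. Here is a minimal sketch with both adjustments; `fake_chan` is a hypothetical stand-in for the channel's generator.

```python
from functools import reduce


def fake_chan(items, seen):
    # Hypothetical stand-in for the channel's receiving generator.
    for item in items:
        seen.append(item)  # record that the item was actually produced
        yield item


seen = []

# The trailing None is the initial value; the lambda discards every item.
reduce(lambda _, __: None, fake_chan("abc", seen), None)

# With the initial value, an empty iterable is handled without an exception.
empty_result = reduce(lambda _, __: None, iter(()), None)
```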

Karl Knechtel