How to limit the size of a comprehension?

Question

I have a list and want to build another list from it via a comprehension. I would like the new list to be limited in size by a condition.

The following code will fail:

a = [1, 2, 1, 2, 1, 2]
b = [i for i in a if i == 1 and len(b) < 3]

with

Traceback (most recent call last):
  File "compr.py", line 2, in <module>
    b = [i for i in a if i == 1 and len(b) < 3]
  File "compr.py", line 2, in <listcomp>
    b = [i for i in a if i == 1 and len(b) < 3]
NameError: name 'b' is not defined

because b is not defined yet at the time the comprehension is built.

Is there a way to limit the size of the new list at build time?

Note: I could break the comprehension into a for loop with the proper break when a counter is reached but I would like to know if there is a mechanism which uses a comprehension.
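
For reference, the loop-based fallback mentioned in the note might look like the minimal sketch below. The name `limit` is illustrative and not part of the original code.

```python
a = [1, 2, 1, 2, 1, 2]
limit = 3  # illustrative name for the maximum size of b

b = []
for i in a:
    if i == 1:
        b.append(i)
        if len(b) == limit:
            break  # stop scanning as soon as the list is full

print(b)  # [1, 1, 1]
```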

score 77 · Accepted Answer · edited Feb 23 '17 at 07:17

You can use a generator expression to do the filtering, then use islice() to limit the number of iterations:

from itertools import islice

filtered = (i for i in a if i == 1)
b = list(islice(filtered, 3))

This ensures you don't do more work than you have to to produce those 3 elements.

Note that there is no longer any point in using a list comprehension here: a list comprehension can't be broken out of with a break statement, so you are locked into iterating to the end.
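
To make the laziness visible, here is a small sketch that records which source items are actually examined. The `source()` generator and the `seen` list are hypothetical instrumentation added for illustration only.

```python
from itertools import islice

seen = []  # records which source items were actually examined

def source():
    # Hypothetical instrumented source, only here to make laziness visible.
    for item in [1, 2, 1, 2, 1, 2, 1, 2]:
        seen.append(item)
        yield item

filtered = (i for i in source() if i == 1)
b = list(islice(filtered, 3))

print(b)     # [1, 1, 1]
print(seen)  # [1, 2, 1, 2, 1]
```

Only the first five source items are touched; the rest of the input is never read.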

edited Feb 23 '17 at 07:17

WoJ

answered Feb 22 '17 at 14:00

Martijn Pieters

`[1/i for i in range(-5, 5)]` does break out and doesn't iterate to the end. – Stefan Pochmann Feb 22 '17 at 14:40
@StefanPochmann: it raises an exception, that's *not the same thing* as a `break` statement. In the end, you have no list result at all. – Martijn Pieters Feb 22 '17 at 14:41
Wasn't clear to me that you meant the `break` statement, that word can be understood in a more general way. [For example](http://stackoverflow.com/a/38675546/1672429) not long ago you said *"[`return`] breaks out of the loop"*. In any case, the iteration doesn't go to the end. Also, not having a list result doesn't even have to be a problem. Consider `reciprocals = [1/x for x in a]`, I think that's reasonable code and if `a` contains a zero then one might want a `ZeroDivisionError` and not want a list. – Stefan Pochmann Feb 22 '17 at 15:08
This is a question about how to limit the size of the list produced by a list comprehension, though. That implies you *still want a list result*. – Martijn Pieters Feb 22 '17 at 15:10

score 6 · Answer 2 · edited May 23 '17 at 12:25

@Martijn Pieters is completely right that itertools.islice is the best way to solve this. However, if you don't mind an additional (external) library, you can use iteration_utilities, which wraps a lot of these itertools and their applications (and some additional ones). It could make this a bit easier, at least if you like functional programming:

>>> from iteration_utilities import Iterable

>>> Iterable([1, 2, 1, 2, 1, 2]).filter((1).__eq__)[:2].as_list()
[1, 1]

>>> (Iterable([1, 2, 1, 2, 1, 2])
...          .filter((1).__eq__)   # like "if item == 1"
...          [:2]                  # like "islice(iterable, 2)"
...          .as_list())           # like "list(iterable)"
[1, 1]

The iteration_utilities.Iterable class uses generators internally, so it will only process as many items as necessary, and only when you call any of the as_* (or get_*) methods.

Disclaimer: I'm the author of the iteration_utilities library.

edited May 23 '17 at 12:25

Community

answered Feb 22 '17 at 14:16

MSeifert

This is a very nice library, thanks (still reading the docs to get a grasp on the multitude of functions) – WoJ Feb 22 '17 at 20:23
Might I recommend changing the first link to the project's home page: http://iteration-utilities.readthedocs.io/en/latest/? – jpmc26 Feb 23 '17 at 01:25
Note that using `(1).__eq__` means you'll get unpleasant results like `1.5` comparing equal to `1`, or `'potato'` comparing equal to `1`, because `NotImplemented` is considered true in a boolean context. (They added a DeprecationWarning for this a few years back, but DeprecationWarning is suppressed by default outside of `__main__`.) – user2357112 Aug 08 '23 at 06:55
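
The pitfall with calling `__eq__` directly can be checked with the small sketch below. The `is_one` helper built from `functools.partial` and `operator.eq` is one possible replacement, shown as an assumption rather than anything the library requires.

```python
from functools import partial
from operator import eq

# int.__eq__ does not know how to compare with a str or a float,
# so it returns the NotImplemented sentinel instead of False.
print((1).__eq__('potato') is NotImplemented)  # True
print((1).__eq__(1.5) is NotImplemented)       # True

# Going through the == operator handles the fallback correctly.
is_one = partial(eq, 1)  # is_one(x) evaluates 1 == x
data = [1, 1.5, 'potato', 1, 2]
print([x for x in data if is_one(x)])  # [1, 1]
```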

score 4 · Answer 3 · edited May 23 '17 at 11:54

You could use itertools.count to generate a counter and itertools.takewhile to stop iterating over a generator when the counter reaches the desired integer (3 in this case):

from itertools import count, takewhile
c = count()
b = list(takewhile(lambda x: next(c) < 3, (i for i in a if i == 1)))

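Or a similar idea: build a construct that raises StopIteration to terminate the generator. That is the closest you'll get to your original idea of breaking out of the list comprehension, but I would not recommend it as best practice. It also only works on Python versions before 3.7: since PEP 479, a StopIteration raised inside a generator is converted to a RuntimeError.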

c = count()
b = list(i if next(c) < 3 else next(iter([])) for i in a if i == 1)

Examples:

>>> a = [1,2,1,4,1,1,1,1]

>>> c = count()
>>> list(takewhile(lambda x: next(c) < 3, (i for i in a if i == 1)))
[1, 1, 1]

>>> c = count()
>>> # Python < 3.7 only; on 3.7+ this raises RuntimeError (PEP 479)
>>> list(i if next(c) < 3 else next(iter([])) for i in a if i == 1)
[1, 1, 1]
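
The takewhile predicate does not have to be a counter. As a rough sketch of a different stopping rule, the example below keeps items while a running total stays within a budget; the `budget` threshold and the use of `itertools.accumulate` are illustrative choices.

```python
from itertools import accumulate, takewhile

a = [1, 2, 1, 4, 1, 1, 1, 1]

# Pair each item with the running total up to and including it,
# then keep items while that total stays within a budget.
budget = 8  # illustrative threshold
pairs = zip(a, accumulate(a))
b = [item for item, total in takewhile(lambda p: p[1] <= budget, pairs)]

print(b)  # [1, 2, 1, 4]
```

Because no StopIteration is raised by hand, this form is unaffected by PEP 479.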

edited May 23 '17 at 11:54

Community

answered Feb 22 '17 at 15:16

Chris_Rands

What advantage over the other answers does this have? – jpmc26 Feb 23 '17 at 01:32
@jpmc26 I don't think it's better than Martijn's solution for this exact purpose, but it's more generalisable because the conditions for terminating the generator could be anything, not just a counter. Also the OP asked specifically about a list comprehension and this is the closest valid syntax to that – Chris_Rands Feb 23 '17 at 08:04
Fair enough. Thanks. Since you posted this well after the other answers, you might want to work something about the flexibility advantage into your answer. – jpmc26 Feb 23 '17 at 08:11

score 3 · Answer 4 · edited May 03 '20 at 19:09

Same solution just without islice:

filtered = (i for i in a if i == 1)
b = [next(filtered) for j in range(3)]

Be careful: if the generator is empty or yields fewer than 3 items, you'll get a StopIteration exception.

To prevent that, you may want to use next() with a default value. For example:

b = [next(filtered, None) for j in range(3)]

And if you don't want 'None' in your list:

b = [i for i in b if i is not None]

edited May 03 '20 at 19:09

answered May 03 '20 at 16:00

madaniel

Much better!!! thanks for showing how to do it without islice – John Henckel Feb 03 '22 at 18:29

jpp · Answer 5 · 2018-06-07T11:06:26.373

itertools.islice is the natural way to extract n items from a generator.

But you can also implement this yourself using a helper function. Just like the itertools.islice pseudo-code, we catch StopIteration to limit the number of items yielded.

This is more adaptable because it allows you to specify logic if n is greater than the number of items in your generator.

def take_n(gen, n):
    for _ in range(n):
        try:
            yield next(gen)
        except StopIteration:
            break

g = (i**2 for i in range(5))
res = list(take_n(g, 20))

print(res)

[0, 1, 4, 9, 16]
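
As one example of such custom logic, the sketch below pads the result when the source runs out. The `take_n_padded` function and its `fill` parameter are hypothetical names introduced here for illustration.

```python
def take_n_padded(iterable, n, fill=None):
    """Yield exactly n items, padding with `fill` if the source runs out.

    Hypothetical variant of take_n, shown only to illustrate custom
    handling of a short source.
    """
    it = iter(iterable)  # also accept lists, not just generators
    for _ in range(n):
        try:
            yield next(it)
        except StopIteration:
            yield fill

res = list(take_n_padded((i**2 for i in range(3)), 5, fill=0))
print(res)  # [0, 1, 4, 0, 0]
```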

score -1 · Answer 6 · edited Aug 08 '23 at 06:19

a = [1, 2, 1, 2, 1, 2]

b = [i for i in a if i == 1][:2]

This builds the full filtered list (evaluating every element of the original list) and then slices it. It won't perform well on a long list, but it is easy to read and quick to write.

edited Aug 08 '23 at 06:19

toyota Supra

answered Aug 08 '23 at 02:47

jorgito

The question was *Is there a way to limit the size of the new list **at build time**?* (note the emphasis - so the intent is to stop during the iteration, not after) – WoJ Aug 08 '23 at 07:04
I agree, that's not building time, so my answer can be deleted – jorgito Aug 08 '23 at 11:44
It's not build time – Sina Rezaei Aug 10 '23 at 23:49

score -4 · Answer 7 · answered Feb 27 '17 at 11:57

Use enumerate:

b = [n for i,n in enumerate(a) if n==1 and i<3]

answered Feb 27 '17 at 11:57

Dorianux

That's simply wrong. First, this will discard everything except the first 3 items of `a` (the question wanted to limit `b`, not `a`), and it will process the whole iterable. It won't stop after finding the third item. It just discards everything thereafter (however, it will still check `n==1 and i < 3`). – MSeifert Feb 27 '17 at 13:16
