114

Suppose I have a list with a huge number of items:

l = [ 1, 4, 6, 30, 2, ... ]

I want to count the items in that list that satisfy a certain condition. My first thought was:

count = len([i for i in l if my_condition(i)])

But if the filtered list is also large, creating a new list just to take its length is a waste of memory. For efficiency, IMHO, the call above can't be better than:

count = 0
for i in l:
    if my_condition(i):
        count += 1

Is there a functional-style way to get the number of items that satisfy the condition without generating a temporary list?

wjandrea
cinsk
The choice between generators and lists is a choice between execution time and memory consumption. You would be surprised how often the results are counterintuitive if you profile the code. Premature optimization is the root of all evil. – Paulo Scardine Mar 13 '13 at 01:00
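
To act on the profiling advice in the comment above, here is a minimal sketch using the standard timeit module. The predicate my_condition and the data size are assumptions chosen only for illustration, and the timings will differ between machines and Python versions, so no expected numbers are shown.

```python
import timeit

def my_condition(x):
    # Hypothetical predicate, chosen only for illustration.
    return x % 4 == 3

data = list(range(100_000))

def count_with_list():
    # Builds a temporary list of matches, then takes its length.
    return len([i for i in data if my_condition(i)])

def count_with_generator():
    # Adds 1 per match without storing the matches.
    return sum(1 for i in data if my_condition(i))

# Both approaches must agree on the count before timing means anything.
assert count_with_list() == count_with_generator()

for func in (count_with_list, count_with_generator):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.3f}s for 20 runs")
```

This only measures time. If memory is the real concern, the generator version avoids holding all matches at once, whatever the timing says.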

5 Answers

141

You can use a generator expression:

>>> l = [1, 3, 7, 2, 6, 8, 10]
>>> sum(1 for i in l if i % 4 == 3)
2

or even

>>> sum(i % 4 == 3 for i in l)
2

which uses the fact that True == 1 and False == 0.
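
A small sketch of why summing booleans works, and of one caveat. It assumes Python 3; the variable names are made up for illustration. The trick relies on the condition producing real bool values (or 0/1). If the expression only returns something truthy, summing it directly gives the wrong count.

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
assert issubclass(bool, int)
assert True + True == 2

values = [1, 3, 7, 2, 6, 8, 10]

# One bool per element; summing them counts the True entries.
flags = [v % 4 == 3 for v in values]
count = sum(flags)

# Caveat: v % 4 is truthy for six of the values, but it is not a bool,
# so summing it adds up the remainders instead of counting matches.
wrong_count = sum(v % 4 for v in values)
right_count = sum(bool(v % 4) for v in values)

print(count, wrong_count, right_count)
```

When in doubt about what the predicate returns, sum(1 for ... if ...) or an explicit bool() is the safer form.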

Alternatively, you could use itertools.imap (Python 2) or simply map (Python 3):

>>> def my_condition(x):
...     return x % 4 == 3
... 
>>> sum(map(my_condition, l))
2
wjandrea
DSM
31

You want a generator expression rather than a list comprehension here.

For example,

l = [1, 4, 6, 7, 30, 2]

def my_condition(x):
    return 5 < x < 20

print(sum(1 for x in l if my_condition(x)))
# -> 2
print(sum(1 for x in range(1000000) if my_condition(x)))
# -> 14

Or use itertools.imap (the built-in map in Python 3), though I think explicit list comprehensions and generator expressions look somewhat more Pythonic.

Note that, though it's not obvious from the sum example, you can compose generator expressions nicely. For example,

inputs = xrange(1000000)      # In Python 3 and above, use range instead of xrange
odds = (x for x in inputs if x % 2)  # Pick odd numbers
sq_inc = (x**2 + 1 for x in odds)    # Square and add one
print(sum(x // 2 for x in sq_inc))   # Actually evaluate each one (// keeps integer division in Python 3)
# -> 83333333333500000

The cool thing about this technique is that you can specify conceptually separate steps in code without forcing evaluation and storage in memory until the final result is evaluated.
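
To make the lazy evaluation visible, here is a minimal Python 3 sketch. The traced helper and the log list are hypothetical, added only to record when items are actually pulled through the pipeline.

```python
def traced(iterable, log):
    # Hypothetical helper: records each item at the moment it is consumed.
    for item in iterable:
        log.append(item)
        yield item

log = []
inputs = traced(range(10), log)
odds = (x for x in inputs if x % 2)   # Pick odd numbers
sq_inc = (x**2 + 1 for x in odds)     # Square and add one

# Building the pipeline has consumed nothing so far.
assert log == []

# Asking for one result pulls only as many inputs as needed:
# 0 is read and filtered out, 1 is read and passed through.
first = next(sq_inc)
assert log == [0, 1]

# Draining the rest pulls the remaining inputs one at a time.
total = first + sum(sq_inc)
print(first, total, log)
```

Each stage holds only the item currently passing through it, so memory use stays flat no matter how long the input is.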

Community
JohnJ
12

This can also be done using reduce (functools.reduce in Python 3) if you prefer functional programming:

reduce(lambda count, i: count + my_condition(i), l, 0)

This way you make only one pass and no intermediate list is generated.
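
For Python 3, where reduce is no longer a built-in, a runnable sketch of the same idea might look like this. The predicate and the sample list are assumptions for illustration, and bool() is added so that a predicate returning a merely truthy value still counts as 1.

```python
from functools import reduce

def my_condition(x):
    # Hypothetical predicate, chosen only for illustration.
    return 5 < x < 20

l = [1, 4, 6, 7, 30, 2]

# The accumulator starts at 0 and grows by 1 for each item that matches.
count = reduce(lambda acc, item: acc + bool(my_condition(item)), l, 0)
print(count)

# The initial value 0 also makes the empty case well defined.
empty_count = reduce(lambda acc, item: acc + bool(my_condition(item)), [], 0)
print(empty_count)
```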

Fermat's Little Student
11

You could do something like:

l = [1,2,3,4,5,..]
count = sum(1 for i in l if my_condition(i))

which just adds 1 for each element that satisfies the condition.

Jsdodgers
2
from itertools import imap  # Python 2 only; in Python 3 the built-in map is already lazy
sum(imap(my_condition, l))
kkonrad