How to count the number of true elements in a NumPy bool array

Question

I have a NumPy array 'boolarr' of boolean type. I want to count the number of elements whose values are True. Is there a NumPy or Python routine dedicated to this task? Or do I need to iterate over the elements in my script?

For pandas: http://stackoverflow.com/questions/26053849/counting-non-zero-values-in-each-column-of-a-dataframe-in-python — Private, Apr 03 '17 at 08:16

score 343 · Accepted Answer · edited Jun 12 '21 at 17:49

You have multiple options. Two of them are:

boolarr.sum()
numpy.count_nonzero(boolarr)

Here's an example:

>>> import numpy as np
>>> boolarr = np.array([[0, 0, 1], [1, 0, 1], [1, 0, 1]], dtype=bool)
>>> boolarr
array([[False, False,  True],
       [ True, False,  True],
       [ True, False,  True]], dtype=bool)

>>> boolarr.sum()
5

Of course, that is a bool-specific answer. More generally, you can use numpy.count_nonzero.

>>> np.count_nonzero(boolarr)
5
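To illustrate why `sum` is the bool-specific option, here is a minimal sketch. The arrays `flags` and `values` are made up for illustration and are not part of the original answer. On a boolean array both calls agree, but on an integer array `sum` adds up the values while `count_nonzero` counts the nonzero entries.

```python
import numpy as np

flags = np.array([False, True, True])  # hypothetical boolean array
values = np.array([0, 2, 3])           # hypothetical integer array

bool_sum = int(flags.sum())                 # True counts as 1, so this is a count
bool_count = int(np.count_nonzero(flags))   # same count
int_sum = int(values.sum())                 # adds the values: 0 + 2 + 3
int_count = int(np.count_nonzero(values))   # counts entries that are not 0

print(bool_sum, bool_count)  # 2 2
print(int_sum, int_count)    # 5 2
```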

edited Jun 12 '21 at 17:49

Eric O. Lebigot

answered Dec 03 '11 at 01:22

David Alber

3

Thanks, David. They look neat. About the method with sum(..), is True always equal to 1 in python (or at least in numpy)? If it is not guaranteed, I will add a check, 'if True==1:' beforehand. About count_nonzero(..), unfortunately, it seems not implemented in my numpy module at version 1.5.1, but I may have a chance to use it in the future. – norio Dec 03 '11 at 01:52
5

@norio Regarding `bool`: boolean values are treated as 1 and 0 in arithmetic operations. See "[Boolean Values](http://docs.python.org/library/stdtypes.html#boolean-values)" in the Python Standard Library documentation. Note that NumPy's `bool` and Python `bool` are not the same, but they are compatible (see [here](http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html#built-in-scalar-types) for more information). – David Alber Dec 03 '11 at 04:39
1

@norio Regarding `numpy.count_nonzero` not being in NumPy v1.5.1: you are right. According to this [release announcement](http://mail.scipy.org/pipermail/numpy-discussion/2011-May/056295.html), it was added in NumPy v1.6.0. – David Alber Dec 03 '11 at 04:41
1

Thank you very much for the replies with the links! – norio Dec 03 '11 at 08:34
31

FWIW, `numpy.count_nonzero` is about a thousand times faster, in my Python interpreter, at least. `python -m timeit -s "import numpy as np; bools = np.random.uniform(size=1000) >= 0.5" "np.count_nonzero(bools)"` vs. `python -m timeit -s "import numpy as np; bools = np.random.uniform(size=1000) >= 0.5" "sum(bools)"` – chbrown Nov 19 '13 at 21:10
10

@chbrown you are right. But you should compare to `np.sum(bools)` instead! However, `np.count_nonzero(bools)` is still ~12x faster. – mab Nov 23 '15 at 18:15
If I try either of those, it works as long as my answer is non-zero. But if I get 0 and I'm doing it in a pivot table, my answer is always False. – Elliptica Aug 03 '16 at 23:46
If you only need to check whether the array contains at least one True element, rather than count them, you can use `np.any(bools)`. – Zikoat Nov 21 '20 at 13:27
@DavidAlber `numpy.count_nonzero` returns wrong results for masked arrays. If a masked array has some masked values and True in all other cells, the result differs from `np.sum()`. Is this a bug or the expected result? – discover Sep 28 '21 at 07:23
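The timing comparison discussed in the comments can be reproduced with a short script along these lines. This is only a sketch: the seed, the number of repetitions, and the names `candidates`, `counts`, and `timings` are arbitrary choices, and the measured ratios will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily
bools = rng.uniform(size=1000) >= 0.5   # same kind of test data as in the comments

# The three ways of counting that the comments compare.
candidates = {
    "builtin sum": lambda: sum(bools),
    "np.sum": lambda: np.sum(bools),
    "np.count_nonzero": lambda: np.count_nonzero(bools),
}

# All three should return the same count.
counts = {name: int(func()) for name, func in candidates.items()}

# Time 1000 calls of each candidate.
timings = {name: timeit.timeit(func, number=1000) for name, func in candidates.items()}

for name in candidates:
    print(f"{name:>18}: count={counts[name]}, {timings[name]:.4f} s per 1000 calls")
```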

score 33 · Answer 2 · edited Sep 09 '19 at 00:03

This question answered a very similar one for me, so I thought I would share:

In plain Python you can use sum() to count True values in a list:

>>> sum([True,True,True,False,False])
3

But this won't work for a nested list:

>>> sum([[False, False, True], [True, False, True]])
TypeError...
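The error occurs because the built-in sum() starts from 0 and then tries to add a whole inner list to that integer. A minimal sketch of two pure-Python workarounds for one level of nesting follows; the variable names are illustrative only.

```python
from itertools import chain

nested = [[False, False, True], [True, False, True]]

# Flatten one level of nesting, then let sum() treat True as 1.
flat_count = sum(chain.from_iterable(nested))

# Equivalent: count each row separately, then add the row totals.
row_count = sum(sum(row) for row in nested)

print(flat_count, row_count)  # 3 3
```

For irregular or deeper nesting this sketch is not enough, and converting regular data to a NumPy array and calling `np.count_nonzero` is usually simpler.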

edited Sep 09 '19 at 00:03

wuerfelfreak

answered Nov 26 '12 at 14:21

Guillaume Gendre

2

You should "flatten" the array of arrays first. Unfortunately, there's no built-in method for that; see http://stackoverflow.com/questions/2158395/flatten-an-irregular-list-of-lists-in-python – tommy chheng Dec 07 '12 at 23:32
2

Thanks Guillaume! Works with Pandas dataframes as well. – JJFord3 Dec 01 '16 at 19:11
The raw built-in `sum` is much slower for Pandas `DataFrame`s and numpy arrays than their respective `sum` methods. – Elias Hasle Dec 08 '22 at 09:00

score 5 · Answer 3 · answered Jun 06 '17 at 18:29

For comparing two NumPy arrays and counting the number of matches (e.g. correct class predictions in machine learning), I found the example below for two dimensions useful:

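import numpy as np
result = np.random.randint(3, size=(5, 2))  # 5x2 random integer array
target = np.random.randint(3, size=(5, 2))  # 5x2 random integer array

res = np.equal(result, target)  # element-wise comparison, gives a bool array
print(result)
print(target)
print(np.sum(res[:, 0]))  # number of matches in the first column
print(np.sum(res[:, 1]))  # number of matches in the second column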

which can be extended to D dimensions.

The results are:

Prediction:

[[1 2]
 [2 0]
 [2 0]
 [1 2]
 [1 2]]

Target:

[[0 1]
 [1 0]
 [2 0]
 [0 0]
 [2 1]]

Count of correct prediction for D=1: 1

Count of correct prediction for D=2: 2
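As a sketch of how this extends beyond slicing one column at a time, the per-column counts can be computed in a single call with the `axis` argument. The arrays below are the ones printed above, hard-coded so the output is reproducible. The names `matches`, `per_column`, `total`, and `accuracy` are illustrative, and `np.count_nonzero` with `axis` assumes NumPy 1.12 or later.

```python
import numpy as np

# The arrays printed above, hard-coded instead of randomly generated.
result = np.array([[1, 2], [2, 0], [2, 0], [1, 2], [1, 2]])
target = np.array([[0, 1], [1, 0], [2, 0], [0, 0], [2, 1]])

matches = result == target                      # bool array, same as np.equal
per_column = np.count_nonzero(matches, axis=0)  # matches in each column
total = int(np.count_nonzero(matches))          # matches overall
accuracy = total / matches.size                 # fraction of matching entries

print(per_column)  # [1 2]
print(total)       # 3
print(accuracy)    # 0.3
```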

score 2 · Answer 4 · answered Mar 13 '21 at 20:39

b[b].size

where b is the Boolean ndarray in question. It filters b for True, and then counts the length of the filtered array.

This probably isn't as efficient as the np.count_nonzero() mentioned previously, but it is useful if you forget the other syntax. Plus, this shorter syntax saves programmer time.

Demo:

In [1]: a = np.array([0,1,3])

In [2]: a
Out[2]: array([0, 1, 3])

In [3]: a[a>=1].size
Out[3]: 2

In [5]: b=a>=1

In [6]: b
Out[6]: array([False,  True,  True])

In [7]: b[b].size
Out[7]: 2

score 0 · Answer 5 · answered Oct 25 '20 at 09:29

boolarr.sum(axis=1)  # or axis=0

axis=1 counts the True values in each row, and axis=0 counts them in each column. So

import numpy as np
boolarr = np.array([[True, True, True], [False, False, True]])
print(boolarr.sum(axis=1))

prints [3 1].
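A runnable sketch of both axis choices on the same small array may make the difference clearer. The names `per_row` and `per_column` are illustrative.

```python
import numpy as np

boolarr = np.array([[True, True, True],
                    [False, False, True]])

per_row = boolarr.sum(axis=1)     # one count per row
per_column = boolarr.sum(axis=0)  # one count per column

print(per_row)     # [3 1]
print(per_column)  # [1 1 2]

# Either set of partial counts adds up to the overall number of True values.
print(per_row.sum() == per_column.sum() == boolarr.sum())  # True
```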

answered Oct 25 '20 at 09:29

Roohullah Kazmi

score 0 · Answer 6 · answered Mar 30 '22 at 14:44

For a 1D array, this is what worked for me:

import numpy as np
numbers = np.array([3, 1, 5, 2, 5, 1, 1, 5, 1, 4, 2, 1, 4, 5, 3, 4,
                    5, 2, 4, 2, 6, 6, 3, 6, 2, 3, 5, 6, 5])

# numbers > 2 is a boolean array; count_nonzero counts its True entries
numbersGreaterThan2 = np.count_nonzero(numbers > 2)

answered Mar 30 '22 at 14:44

jose pablo solano
