Select rows based on a condition in numpy/python

Question

I generate a 4x4 matrix of normally distributed random values, and I need to select the rows whose sum is greater than 0.

When I write the code using 2D indexing, the output doesn't seem right:

a = np.random.randn(4, 4)
a[a[:, 0] > 0]

What am I missing?

rows whose sum is greater than 0 – Andrewgorn, Dec 01 '21 at 14:03

score 3 · Answer 1 · answered Dec 01 '21 at 13:54

import numpy as np

a = np.random.randn(4, 4)
print(a)

which in this case gives:

[[-0.73576686 -0.34940161 -0.87025271 -0.61287421]
 [ 1.2738813  -0.3855836  -1.55570664  0.43841268]
 [-1.63614248  1.4127681   0.37276815 -0.35188628]
 [ 0.18570751 -0.31197874 -2.05487768 -0.05619158]]

and then apply the condition:

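a[np.sum(a, axis=1) > 0, :]

For the matrix printed above, every row sum is negative (roughly -2.57, -0.23, -0.20 and -2.24), so the result here is an empty array of shape (0, 4). With a different random draw, the rows with a positive sum are returned.

Edit: For a bit of explanation, np.sum(a, axis=1) > 0 creates a 1D Boolean mask with one entry per row. Note that axis=1 sums across the columns of each row; axis=0 would give column sums instead, and would only appear to work here because the matrix is square. We then apply this mask to the rows of a with Boolean indexing: a[np.sum(a, axis=1) > 0, :].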
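To make the role of the axis argument concrete, here is a minimal sketch on a small hand-written matrix. The values are illustrative and not taken from the question; a non-square shape is used so that row sums and column sums cannot be confused.

```python
import numpy as np

# Hand-written 3x4 matrix (illustrative values, not from the question)
a = np.array([
    [ 1.0, -2.0,  0.5, -1.0],   # row sum: -1.5
    [ 2.0,  1.0, -0.5,  0.5],   # row sum:  3.0
    [-1.0,  0.5,  1.0,  0.0],   # row sum:  0.5
])

row_sums = a.sum(axis=1)   # one value per row, shape (3,)
col_sums = a.sum(axis=0)   # one value per column, shape (4,)

mask = row_sums > 0        # 1D Boolean mask over the rows
selected = a[mask]         # equivalent to a[mask, :]
print(selected)            # the second and third rows
```

Because the mask has one entry per row, it must come from axis=1. A mask built from col_sums would have length 4 and could not index the 3 rows of this matrix.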

score 1 · Accepted Answer · edited Dec 03 '21 at 11:29


Try using np.where combined with np.sum:

import numpy as np

np.random.seed(0)
a = np.random.randn(4, 4)
indices = np.where(np.sum(a, axis=1) > 0)
print(a[indices])  # rows with sum > 0
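As a side note, np.where is optional here: the Boolean mask can index the array directly. The short sketch below, using the same seed as above, suggests that both forms select the same rows; the names mask, via_where and via_mask are only illustrative.

```python
import numpy as np

np.random.seed(0)                # same seed as the snippet above
a = np.random.randn(4, 4)

mask = a.sum(axis=1) > 0         # Boolean mask with one entry per row
indices = np.where(mask)         # tuple holding one array of row indices

via_where = a[indices]           # integer-index selection
via_mask = a[mask]               # Boolean-mask selection
print(np.array_equal(via_where, via_mask))
```

np.where is mainly useful when the row indices themselves are needed later; for plain selection the mask alone is usually enough.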

answered Dec 01 '21 at 13:54 by waykiki · edited Dec 03 '21 at 11:29 by Ali_Sh
