
I have the following Python code to sum one array subject to a condition on another array:

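total = 0  # named total to avoid shadowing the built-in sum()
for i in range(grp_num):
    # add histo1[i] only where the matching lower bound is positive
    if lower_bounds[i] > 0:
        total = total + histo1[i]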

I believe the NumPy equivalent would be np.where(lower_bounds>0, histo1,0).sum(), but that adds up everything in histo1, ignoring the requirement that lower_bounds>0. Why? Is there another way to do this? Thanks.

juanpa.arrivillaga
P. Sheih

1 Answer


OK, this is admittedly guesswork, but the only explanation I can think of for your np.where(lower_bounds>0, histo1,0).sum() returning the full sum is that

  • you are on Python 2
  • lower_bounds is a list, not an array

On Python 2, comparing a list with an int does not raise an error; it simply evaluates to True:

[1, 2] > 0
True

This means the condition passed to np.where is the single value True rather than an element-wise mask. NumPy broadcasts that one value across histo1, so every element is picked from histo1 and none from 0. Note that the alternative suggested in the comments, histo1[lower_bounds>0].sum(), will not work in this situation either: it returns histo1[1].
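The broadcasting effect can be reproduced without Python 2. The following is only a sketch: histo1 holds made-up values, and the literal True stands in for what Python 2 would produce for a list compared with 0.

```python
import numpy as np

# Hypothetical histogram counts, for illustration only.
histo1 = np.array([5, 7, 11, 13])

# Stand-in for the Python 2 result of `some_list > 0`: a single bool,
# not an element-wise mask.
condition = True

# The scalar condition is broadcast against histo1, so every element
# is taken from histo1 and the fallback 0 is never used.
picked = np.where(condition, histo1, 0)
total = picked.sum()

print(picked, total)
```

Because the condition is a scalar, the result has the same shape as histo1 and its sum equals the sum of the whole array.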

The solution: explicitly convert lower_bounds to an array before comparing:

 np.where(np.array(lower_bounds)>0, histo1, 0).sum()
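As a rough end-to-end check, the sketch below compares the converted NumPy forms with the original loop. The data values are invented for illustration, and np.asarray is used here in place of np.array only to avoid copying when the input is already an array; either should work.

```python
import numpy as np

# Hypothetical inputs, deliberately plain lists as in the failing case.
lower_bounds = [-1.0, 0.5, 0.0, 2.0]
histo1 = [5, 7, 11, 13]

# Converting first makes the comparison element-wise.
mask = np.asarray(lower_bounds) > 0
counts = np.asarray(histo1)

# Two equivalent NumPy formulations.
total_where = np.where(mask, counts, 0).sum()
total_mask = counts[mask].sum()

# Reference result from the explicit loop in the question.
total_loop = 0
for i in range(len(lower_bounds)):
    if lower_bounds[i] > 0:
        total_loop += histo1[i]

print(total_where, total_mask, total_loop)
```

With the conversion in place, all three totals agree, and the boolean-mask form from the comments works as well.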

By the way, on Python 3 the same comparison raises an exception:

[1, 2] > 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'list' and 'int'
Paul Panzer
  • You were right (about Python 2 and the list)! As a beginner in Python, I was not disciplined about data structures. Thanks. – P. Sheih Feb 18 '17 at 10:15