Finding items in one array based upon a second array

Question

I have two arrays A and B:

A=array([[ 5.,  5.,  5.],
         [ 8.,  9.,  9.]])
B=array([[ 1.,  1.,  2.],
         [ 3.,  2.,  1.]])

Anywhere there is a "1" in B I want to sum the same row and column locations in A.

So for example for this one the answer would be 5+5+9=10

I would want this to continue for 2,3....n (all unique values in B)

So for the 2's... it would be 9+5=14 and for the 3's it would be 8

I found the unique values by using:

numpy.unique(B)

I realize this may take multiple steps, but I can't really wrap my head around using the index matrix to sum those locations in another matrix.

I think you mean `5+5+9=19` :-) – BrenBarn Jul 27 '12 at 18:01

score 4 · Answer 1 · answered Jul 27 '12 at 17:58

For each unique value x, you can do

A[B == x].sum()

Example:

>>> A[B == 1.0].sum()
19.0

Sven Marnach

BrenBarn · Answer 2 · 2012-07-28T00:40:01.050

1

[(val, np.sum(A[B==val])) for val in np.unique(B)] gives you a list of tuples where the first element is one of the unique values in B, and the second element is the sum of elements in A where the corresponding value in B is that value.

>>> [(val, np.sum(A[B==val])) for val in np.unique(B)]
[(1.0, 19.0), (2.0, 14.0), (3.0, 8.0)]

The key is that you can use A[B==val] to access items in A at positions where B equals val.

Edit: If you just want the sums, just do [np.sum(A[B==val]) for val in np.unique(B)].
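To see why this works, it may help to look at the boolean mask that B == val produces. A minimal sketch using the arrays from the question (the names mask, selected and total are only illustrative, not from the original answer):

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# Comparing an array with a scalar gives a boolean array of the same shape.
# Here it is True at positions (0, 0), (0, 1) and (1, 2).
mask = (B == 1.0)

# Indexing A with the mask returns a 1-D array of the selected elements.
selected = A[mask]

# Summing the selection gives the total for this label: 5 + 5 + 9.
total = selected.sum()
```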

edited Jul 28 '12 at 00:40

answered Jul 27 '12 at 18:00

BrenBarn

How would I get the output to be: `[ 0. 19. 14. 8.]` and would this also put 0's where there was nothing in array B? – Eric Escobar Jul 27 '12 at 22:21
@EricEscobar, see my edited answer. If you want 0 for missing elements in B, you need to be more specific. Obviously there are infinitely many values that aren't present in B, so you need to say which values you expect to be in B. – BrenBarn Jul 28 '12 at 00:40
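One way to act on that, sketched here under the assumption that you know the labels in advance: loop over the expected labels instead of over np.unique(B). The list expected below is a made-up example, not something from the question. The sum of an empty selection is 0.0, so labels that never occur in B come out as zero.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# Hypothetical list of labels we expect to report on (an assumption).
expected = [0, 1, 2, 3, 4]

# A[B == val] is empty when val does not occur in B, and its sum is 0.0.
sums = np.array([A[B == val].sum() for val in expected])
```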

score 1 · Answer 3 · answered Jul 27 '12 at 21:37

I think numpy.bincount is what you want. If B is an array of small integers like in your example, you can do something like this:

import numpy
A = numpy.array([[ 5.,  5.,  5.],
                 [ 8.,  9.,  9.]])
B = numpy.array([[ 1,  1,  2],
                 [ 3,  2,  1]])
print(numpy.bincount(B.ravel(), weights=A.ravel()))
# [  0.  19.  14.   8.]

or if B has anything but small integers you can do something like this

import numpy
A = numpy.array([[ 5.,  5.,  5.],
                 [ 8.,  9.,  9.]])
B = numpy.array([[ 1.,  1.,  2.],
                 [ 3.,  2.,  1.]])
uniqB, inverse = numpy.unique(B, return_inverse=True)
# ravel() keeps this working when inverse comes back with the shape of B
print(uniqB, numpy.bincount(inverse.ravel(), weights=A.ravel()))
# [ 1.  2.  3.] [ 19.  14.   8.]
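If it is unclear what return_inverse gives you, the following sketch may help. It assumes the arrays from the question; the name rebuilt is only illustrative. Each entry of inverse is the position in uniqB of the corresponding element of B, which is exactly the kind of small non-negative integer label that bincount expects.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

uniqB, inverse = np.unique(B, return_inverse=True)

# Depending on the NumPy version, inverse is 1-D or has the shape of B,
# so flatten it before using it as a list of labels.
inverse = inverse.ravel()

# inverse[k] is the index into uniqB of the k-th element of B.ravel(),
# so indexing uniqB with it reconstructs B.
rebuilt = uniqB[inverse].reshape(B.shape)

# The indices act as integer group labels for bincount.
sums = np.bincount(inverse, weights=A.ravel())
```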

Bi Rico

That is exactly what I was looking for, but for some reason I keep getting "array cannot be safely cast to required type". Do you have any idea why I would keep getting this error? Thanks! @Bago – Eric Escobar Jul 27 '12 at 22:10
Never mind, I think it's because it is a float and not an int. I really did like this output though, because I need it to have a "0" if there are no matches so that the column spacing doesn't get out of whack. – Eric Escobar Jul 27 '12 at 22:18
You can always do numpy.bincount(B.ravel().astype('int'), weights=A.ravel()) if you have an array of floats that you know can safely be cast to int. – Bi Rico Jul 27 '12 at 22:50
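A rough sketch of that cast with a sanity check added, in case you are not sure the floats are really whole numbers. The check and the minlength value of 5 are my own assumptions for illustration. minlength pads the result with zeros, which may help keep the number of columns fixed.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

B_int = B.astype(int)

# Refuse to continue if the cast changed any value (e.g. 1.5 became 1).
if not np.array_equal(B_int, B):
    raise ValueError("B contains non-integer values")

# bincount needs non-negative integer labels; minlength=5 (an assumed
# value) guarantees at least 5 output bins, padded with zeros.
sums = np.bincount(B_int.ravel(), weights=A.ravel(), minlength=5)
```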

OYRM · Answer 4 · 2012-07-27T18:15:28.893

I'd use numpy masked arrays. These are standard numpy arrays with a mask associated with them blocking off certain values. The process is pretty straightforward: create a masked array using

numpy.ma.masked_array(data, mask)

where mask is generated by using a masked function

mask = numpy.ma.masked_not_equal(B, 1).mask

and data is A

for i in numpy.unique(B):
    print(numpy.ma.masked_array(A, numpy.ma.masked_not_equal(B, i).mask).sum())

19.0
14.0
8.0
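For the arrays in the question, the mask built by masked_not_equal(B, 1) is the same as the plain comparison B != 1, so the masked array can also be built directly from that. A minimal sketch for the label 1 (the names masked, total and remaining are only illustrative):

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# True in the mask means "hide this element".
masked = np.ma.masked_array(A, mask=(B != 1))

# Reductions such as sum() skip the masked elements.
total = masked.sum()

# compressed() returns the unmasked values as a plain 1-D array.
remaining = masked.compressed()
```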

score 0 · Answer 5 · edited May 23 '17 at 11:52

I found an old question here.

One of its answers:

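import numpy as np

def sum_by_group(values, groups):
    # Sort both arrays so that equal group labels end up adjacent.
    order = np.argsort(groups)
    groups = groups[order]
    values = values[order]
    # Running total over the sorted values (in place, on the sorted copy).
    values.cumsum(out=values)
    # Mark the last element of each run of equal labels.
    index = np.ones(len(groups), 'bool')
    index[:-1] = groups[1:] != groups[:-1]
    values = values[index]
    groups = groups[index]
    # Differences between running totals at run ends are the per-group sums.
    values[1:] = values[1:] - values[:-1]
    return values, groups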

in your case, you can flatten your array

aflat = A.flatten()
bflat = B.flatten()
sum_by_group(aflat, bflat)
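The same sort-then-segment idea can also be written with np.add.reduceat. To be clear, this is a different technique from the cumsum trick above, shown only as a sketch: it assumes non-empty 1-D inputs, and the variable names are my own.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

aflat = A.ravel()
bflat = B.ravel()

# Sort so that equal labels are adjacent.
order = np.argsort(bflat, kind="stable")
sorted_b = bflat[order]
sorted_a = aflat[order]

# Mark the first element of each run of equal labels.
is_start = np.ones(len(sorted_b), dtype=bool)
is_start[1:] = sorted_b[1:] != sorted_b[:-1]
starts = np.flatnonzero(is_start)

# reduceat sums each segment from one start index up to the next.
sums = np.add.reduceat(sorted_a, starts)
groups = sorted_b[starts]
```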
