4

I have two arrays A and B:

A=array([[ 5.,  5.,  5.],
         [ 8.,  9.,  9.]])
B=array([[ 1.,  1.,  2.],
         [ 3.,  2.,  1.]])

Anywhere there is a "1" in B I want to sum the same row and column locations in A.

So, for example, for the 1's the answer would be 5+5+9=19

I would want this to continue for 2,3....n (all unique values in B)

So for the 2's... it would be 9+5=14 and for the 3's it would be 8

I found the unique values by using:

numpy.unique(B)

I realize this may take multiple steps, but I can't really wrap my head around using the index matrix to sum those locations in another matrix.

Alex Riley
Eric Escobar

5 Answers

4

For each unique value x, you can do

A[B == x].sum()

Example:

>>> A[B == 1.0].sum()
19.0
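To make the indexing step explicit, here is a minimal sketch that splits the one-liner into its parts, using the arrays from the question. The names `mask`, `selected` and `total` are only illustrative.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# Comparing B to a scalar gives a boolean array with the same shape as B.
mask = (B == 1.0)

# Indexing A with that boolean array returns a 1-D array holding the
# elements of A at the positions where the mask is True.
selected = A[mask]

# Summing the selected elements gives the total for this value of B.
total = selected.sum()
print(mask)
print(selected, total)
```

The same three steps apply to any other value in B; only the scalar in the comparison changes.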
Sven Marnach
1

[(val, np.sum(A[B==val])) for val in np.unique(B)] gives you a list of tuples where the first element is one of the unique values in B, and the second element is the sum of elements in A where the corresponding value in B is that value.

>>> [(val, np.sum(A[B==val])) for val in np.unique(B)]
[(1.0, 19.0), (2.0, 14.0), (3.0, 8.0)]

The key is that you can use A[B==val] to access items in A at positions where B equals val.

Edit: If you just want the sums, just do [np.sum(A[B==val]) for val in np.unique(B)].
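If zeros are wanted for values that do not occur in B, one option is to iterate over an explicit list of expected values instead of np.unique(B). This is only a sketch: `expected_labels` is a hypothetical list that you would have to choose yourself.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# Hypothetical list of labels we expect to see; 0 does not occur in B.
expected_labels = [0, 1, 2, 3]

# A[B == val] is empty when val is absent from B, and the sum of an
# empty array is 0.0, so missing labels show up as zeros.
sums = [float(A[B == val].sum()) for val in expected_labels]
print(sums)
```

This works because an all-False mask selects nothing and an empty selection sums to zero.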

BrenBarn
  • How would I get the output to be: `[ 0. 19. 14. 8.]` and would this also put 0's where there was nothing in array B? – Eric Escobar Jul 27 '12 at 22:21
  • @EricEscobar, see my edited answer. If you want 0 for missing elements in B, you need to be more specific. Obviously there are infinitely many values that aren't present in B, so you need to say which values you expect to be in B. – BrenBarn Jul 28 '12 at 00:40
1

I think numpy.bincount is what you want. If B is an array of small integers like in your example, you can do something like this:

import numpy
A = numpy.array([[ 5.,  5.,  5.],
                 [ 8.,  9.,  9.]])
B = numpy.array([[ 1,  1,  2],
                 [ 3,  2,  1]])
print(numpy.bincount(B.ravel(), weights=A.ravel()))
# [  0.  19.  14.   8.]

Or, if B has anything but small integers, you can do something like this:

import numpy
A = numpy.array([[ 5.,  5.,  5.],
                 [ 8.,  9.,  9.]])
B = numpy.array([[ 1.,  1.,  2.],
                 [ 3.,  2.,  1.]])
uniqB, inverse = numpy.unique(B, return_inverse=True)
print(uniqB, numpy.bincount(inverse, weights=A.ravel()))
# [ 1.  2.  3.] [ 19.  14.   8.]
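When B holds floats that are really whole numbers, numpy.bincount needs an integer array first. The following is a rough sketch of one way to cast defensively; the check and the `minlength=5` value are assumptions chosen for illustration, not part of the original answer.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# Cast to the platform index type, which is what bincount expects.
B_int = B.astype(np.intp)

# Refuse to continue if the cast changed any value or produced negatives,
# since bincount only accepts non-negative integers.
if not np.array_equal(B_int, B) or B_int.min() < 0:
    raise ValueError("B must contain non-negative whole numbers")

# minlength pads the result with zeros up to a fixed number of bins,
# which keeps the output length stable when high labels are absent.
sums = np.bincount(B_int.ravel(), weights=A.ravel(), minlength=5)
print(sums)
```

Index i of the result is the sum for label i, so labels 0 and 4 come out as zero here.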
Bi Rico
  • That is exactly what I was looking for, but for some reason I keep getting "array cannot be safely cast to required type". Do you have any idea why I would keep getting this error? Thanks! @Bago – Eric Escobar Jul 27 '12 at 22:10
  • Never mind, I think it's because it is a float and not an int. I really did like this output though, because I need it to have a "0" if there are no matches so that the column spacing doesn't get out of whack – Eric Escobar Jul 27 '12 at 22:18
  • You can always do numpy.bincount(B.ravel().astype('int'), weights=A.ravel()) if you have an array of floats that you know can safely be cast to int. – Bi Rico Jul 27 '12 at 22:50
0

I'd use numpy masked arrays. These are standard numpy arrays with an associated mask that blocks off certain values. The process is pretty straightforward: create a masked array using

numpy.ma.masked_array(data, mask)

where mask is generated by one of the numpy.ma masking functions

mask = numpy.ma.masked_not_equal(B, 1).mask

and data is A

for i in numpy.unique(B):
    print(numpy.ma.masked_array(A, numpy.ma.masked_not_equal(B, i).mask).sum())

19.0
14.0
8.0
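As a variation on the loop above, numpy.ma.masked_where can build the masked array in a single call from a boolean condition. This is a sketch of that alternative, not the code from the answer; the name `sums` is illustrative.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

# masked_where hides every element of A where the condition is True,
# so masking "B != v" leaves only the positions where B equals v.
sums = {float(v): float(np.ma.masked_where(B != v, A).sum())
        for v in np.unique(B)}
print(sums)
```

Masked elements are skipped by sum(), which is why each entry only counts the matching positions.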
OYRM
0

I found an old question here.

One of the answers:

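import numpy as np

def sum_by_group(values, groups):
    # Sort both arrays so that equal group labels are adjacent
    order = np.argsort(groups)
    groups = groups[order]
    values = values[order]
    # Running total, computed in place on the sorted copy
    values.cumsum(out=values)
    # Mark the last element of each run of equal labels
    index = np.ones(len(groups), 'bool')
    index[:-1] = groups[1:] != groups[:-1]
    values = values[index]
    groups = groups[index]
    # Differences between run-end totals are the per-group sums
    values[1:] = values[1:] - values[:-1]
    return values, groups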

In your case, you can flatten your arrays:

aflat = A.flatten()
bflat = B.flatten()
sum_by_group(aflat, bflat)
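For comparison, the same sort-then-group idea can be sketched with np.add.reduceat in place of the cumulative sum. This is a different technique from the function above, shown on the flattened arrays from the question; the names `starts`, `labels` and `sums` are illustrative.

```python
import numpy as np

A = np.array([[5., 5., 5.],
              [8., 9., 9.]])
B = np.array([[1., 1., 2.],
              [3., 2., 1.]])

aflat = A.flatten()
bflat = B.flatten()

# Sort so that equal labels sit next to each other.
order = np.argsort(bflat, kind="stable")
sorted_labels = bflat[order]
sorted_values = aflat[order]

# Index of the first element of each run of equal labels.
is_start = np.r_[True, sorted_labels[1:] != sorted_labels[:-1]]
starts = np.flatnonzero(is_start)

# reduceat sums each slice between consecutive start indices.
sums = np.add.reduceat(sorted_values, starts)
labels = sorted_labels[starts]
print(labels, sums)
```

Summing each run directly avoids subtracting large running totals, which may matter for long float arrays.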
Community
prasastoadi