0

Let's say I have 2 arrays A and B where the values in A correspond to the index in the first dimension of B

A = np.array([[ 0,  3,  4], 
              [ 5,  6,  0],
              [ 0,  1,  9],
              [11, 12, 13]])

B = np.array([[0, 1, 1],
              [1, 2, 0],
              [0, 2, 3],
              [3, 3, 3],
              [0, 3, 1],
              [3, 1, 1],
              [1, 1, 0]])

For each row j of B, I want to sum the elements in A that correspond to A[B[j, i], i], to get a new array of shape 7. I can do it with loops like this:

output_array = np.zeros(B.shape[0])
for i in range(A.shape[1]):
    for j in range(B.shape[0]):
        output_array[j] += A[B[j, i], i]

which would give me as output_array

array([ 6., 10., 14., 36., 12., 17., 15.])

Is there any fast and efficient way to do the same for large arrays without having to loop?

The closest answer I could find here was this one, but I need to actually do a sum inside one of the loop and I wasn't able to extend to my case.

SadCactus
  • 71
  • 6

5 Answers5

1

Using fancy indexing:

out = A[B,
  np.repeat(np.arange(A.shape[1])[None],
            B.shape[0], axis=0)
 ].sum(axis=1)

Output:

array([ 6, 10, 14, 36, 12, 17, 15])

This works by using arrays indices of the shape of B to index A, thus maintaining the shape of B as output.

Intermediates:

np.repeat(np.arange(A.shape[1])[None], B.shape[0], axis=0)

array([[0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2],
       [0, 1, 2]])


out = A[B,
  np.repeat(np.arange(A.shape[1])[None],
            B.shape[0], axis=0)
 ]

array([[ 0,  6,  0],
       [ 5,  1,  4],
       [ 0,  1, 13],
       [11, 12, 13],
       [ 0, 12,  0],
       [11,  6,  0],
       [ 5,  6,  4]])
mozway
  • 194,879
  • 13
  • 39
  • 75
1

You can make it simple as follows:

def numpy_solution(A, B):
    return A[B, np.arange(A.shape[1])].sum(axis=1)

However, this creates a temporary array of the same size as B (20000, 500), which degrades performance.

To get around this, and this may seem counterintuitive, you can actually make it faster by keeping one loop.

def keep_i_loop(A, B):
    output_array = np.zeros(B.shape[0], dtype=A.dtype)
    for i in range(A.shape[1]):
        output_array += A[B[:, i], i]
    return output_array

Furthermore, in terms of speed, this is a calculation that Numba is better at.

import numba


@numba.njit("int32[:](int32[:,:],int32[:,:])")
def numba_solution(A, B):
    output_array = np.zeros(B.shape[0], dtype=A.dtype)
    # Note that I have swapped i and j loops.
    for j in range(B.shape[0]):
        for i in range(A.shape[1]):
            output_array[j] += A[B[j, i], i]
    return output_array

Here is the benchmark code and results.

import timeit
import numpy as np


def baseline(A, B):
    output_array = np.zeros(B.shape[0])
    for i in range(A.shape[1]):
        for j in range(B.shape[0]):
            output_array[j] += A[B[j, i], i]
    return output_array


def benchmark():
    rng = np.random.default_rng(0)
    A = rng.integers(0, 1000, size=(2000, 500), dtype=np.int32)
    B = rng.integers(0, 2000, size=(20000, 500), dtype=np.int32)

    expected = baseline(A, B)
    assert np.all(expected == numpy_solution(A, B))
    assert np.all(expected == keep_i_loop(A, B))
    assert np.all(expected == numba_solution(A, B))

    n = 100
    r = timeit.timeit(lambda: numpy_solution(A, B), number=n) / n
    print(f"numpy_solution: {r * 1000:.0f} ms")

    r = timeit.timeit(lambda: keep_i_loop(A, B), number=n) / n
    print(f"keep_i_loop: {r * 1000:.0f} ms")

    r = timeit.timeit(lambda: numba_solution(A, B), number=n) / n
    print(f"numba_solution: {r * 1000:.0f} ms")


if __name__ == "__main__":
    benchmark()
numpy_solution: 66 ms
keep_i_loop: 46 ms
numba_solution: 15 ms
ken
  • 1,543
  • 1
  • 2
  • 14
0

The use of vectorized operations and NumPy's enhanced indexing will enable you to accomplish this without using explicit loops. The way to do this is to establish a set of indices for each of the two dimensions, use those indices to access array A, and then compute the sum along that specific dimension. Here is how to go about it:

import numpy as np

A = np.array([[ 0,  3,  4],
              [ 5,  6,  0],
              [ 0,  1,  9],
              [11, 12, 13]])

B = np.array([[0, 1, 1],
              [1, 2, 0],
              [0, 2, 3],
              [3, 3, 3],
              [0, 3, 1],
              [3, 1, 1],
              [1, 1, 0]])

# Create indices for both dimensions using broadcasting
row_indices = np.arange(B.shape[0])[:, np.newaxis]
column_indices = B

# Index into A using the indices arrays
result = A[column_indices, row_indices]

# Sum along the specified dimension (axis=1)
output_array = np.sum(result, axis=1)

print(output_array)

Output:

array([ 6, 10, 14, 36, 12, 17, 15])
0
# Create a column index array for A using np.arange and np.tile
column_index_array = np.tile(np.arange(A.shape[1]), B.shape[0]).reshape(B.shape)

# Create a row index array for A using np.arange and broadcasting
row_index_array = np.arange(B.shape[0])[:, None]

# Iterate B's indices using row and column indices to fetch corresponding values from A
output_array = A[B, column_index_array].sum(axis=1)
0

With short np.trace routine to sum along diagonals of the "index" array:

out_arr = np.trace(A[B], axis1=1, axis2=2)

array([ 6, 10, 14, 36, 12, 17, 15])
RomanPerekhrest
  • 88,541
  • 4
  • 65
  • 105