2

I have a relatively simple problem that I cannot solve without using loops. It is difficult for me to figure out the correct title for this problem. Lets say we have two numpy arrays:

array_1 = np.array([[0, 1, 2],
                    [3, 3, 3],
                    [3, 3, 4],
                    [3, 6, 2]])

array_2 = np.array([[0, 0, 0], 
                    [1, 1, 1],
                    [2, 2, 2],
                    [3, 3, 3],
                    [4, 4, 4],
                    [5, 5, 5],
                    [6, 6, 6]])

array_1 represents indices of the rows in array_2 that we want to sum. So for example, 4th row in result array should contain summed all rows in array_2 that have same row indices as all 3s in array_1.

It is much easier to understand it in the code:

result = np.empty(array_2.shape)

for i in range(array_1.shape[0]):
    for j in range(array_1.shape[1]):
        index = array_1[i, j]
        result[index] = result[index] + array_2[i]

Result should be:

[[ 0  0  0]
 [ 0  0  0]
 [ 3  3  3]
 [10 10 10]
 [ 2  2  2]
 [ 0  0  0]
 [ 3  3  3]]

I tried to use np.einsum but I need to use both elements in array as indices and also its rows as indices so I'm not sure if np.einsum is the best path here.

This is the problem I have in graphics. array_1 represent indices of vertices for triangles and array_2 represents normals where index of a row corresponds to the index of the vertex

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
  • Problem is clear, but I am curious: why would you want to solve this without loops? – sarema Jan 11 '22 at 21:16
  • This is the problem I have in graphics so `array_1` and `array_2` can be massive and this can get extremely slow. But what bugs me the most with this problem is that there are just two simple for loops and I can't port them to numpy. It is making my brain boil for 3rd day – Nemanja Stojanovic Jan 11 '22 at 21:19
  • "and I can't port them to numpy.". Why do you want to do this? Do you want to speed up the algorithm? What is it you are trying to achieve at the end of the day? – sarema Jan 11 '22 at 21:21
  • Yes. I want to speed it up – Nemanja Stojanovic Jan 11 '22 at 21:22
  • Shouldn't the outupt have the same size as `array1`, not `array2`? – Mad Physicist Jan 11 '22 at 21:26
  • @sarema. If you vectorize this, it will run 10-100x faster – Mad Physicist Jan 11 '22 at 21:27
  • I know that. I just wanted to make sure this is actually his goal – sarema Jan 11 '22 at 21:28
  • Just to make sure I'm reading it right, first row of output is `array2[0] + array2[1] + array2[2]`, since `array1[0]` is `[0, 1, 2]`? – Mad Physicist Jan 11 '22 at 21:28
  • No first row is summed all rows that have same indices as `0`s in `array_1`, so first row should be `[0, 0, 0]`, second row should be also `[0, 0, 0]`, third row `[3, 3, 3]`, fourth `[10, 10, 10]`... Fourth is `[10, 10, 10]` because `3` happens the most – Nemanja Stojanovic Jan 11 '22 at 21:32
  • OK. So the output is actually `(array_1.max() + 1, array_2.shape[1])` in size, since the shape of `array_2` has nothing to do with it. You never select a row past the fourth, since `array_1` only has four rows. So for example, `result[6] == [3, 3, 3]` because the only `6` is in row `3`. – Mad Physicist Jan 11 '22 at 21:38
  • No. Output array has to have the same size as the `array_2`, that's why I put or purpose the max value to be 6 in `array_1`. Output array should have 7 rows, same as `array_2` – Nemanja Stojanovic Jan 11 '22 at 21:40
  • Since `i` is used in `array_2[i]` in the inner loop, and `i` goes from 0 to 3 (inclusive), then only the first four rows of `array_2` are actually used in the `result` array. Is this correct? – Breno Jan 11 '22 at 21:46
  • 1
    Get the looped version working correctly and show the results. Then we can talk about doing the same thing with the full power of numpy. – hpaulj Jan 11 '22 at 21:53

1 Answers1

5

Any time you're adding something from a repeated index, normal ufuncs like np.add don't work out of the box because they only process a repeated fancy index once. Instead, you have to use the unbuffered version, which is np.add.at.

Here, you have a pair of indices: the row in array_1 is the row index into array_2, and the element of array_1 is the row index into the output.

First, construct the indices explicitly as fancy indices. This will make it much simpler to use them:

output_row = array_1.ravel()
input_row = np.repeat(np.arange(array_1.shape[0]), array_1.shape[1]).ravel()

You can apply input_row directly to array_2, but you need add.at to use output_row:

output = np.zeros_like(array_2)
np.add.at(output, output_row, array_2[input_row])

You really only use the first four rows of array_2, so it could be truncated to

array_2 = array2[:array_1.shape[0]]

In that case, you would want to initialize the output as:

output = np.zeros_like(array_2, shape=(output_row.max() + 1, array2.shape[1]))
Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
  • That's not correct. The shape of the result is `(4, 3)`, which is not what the OP is looking for. – sarema Jan 11 '22 at 21:34
  • @sarema. See update. I have the same answer as OP now. – Mad Physicist Jan 11 '22 at 21:57
  • 1
    Wow. I will have to reread this answer a couple of times to fully understand it. Thank You!. Anyway, I have no idea what title to put on this question. Any help would be appreciated! – Nemanja Stojanovic Jan 11 '22 at 22:03
  • 1
    I see. Impressive! – sarema Jan 11 '22 at 22:03
  • 2
    @NemanjaStojanovic. I've found fancy indexing to be much more counterintuitive than `np.ufunc.at`. All it does is allow you to do `output[output_row] += array_2[input_row]` correctly. I would recommend two things: first forget that the arrays are 2D and work with a 1D example, then read the docs on fancy indexing: https://numpy.org/doc/stable/user/basics.indexing.html#integer-array-indexing – Mad Physicist Jan 11 '22 at 22:10