
Here is an example of the situation I want to find an efficient solution for:

import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([0,1,2])

The vector b should have one integer entry for each row of a, each between 0 and the number of columns of a (inclusive). The desired output c is a matrix with the same shape as a whose i-th row contains the first b[i] entries of the i-th row of a (that is, every entry up to but excluding index b[i]), with the rest of the row filled with zeros. In the example given, the desired result is:

c = np.array([[0,0,0],[4,0,0],[7,8,0]])

There are two "easy" ways to do this:

c = np.zeros_like(a)  # same shape and integer dtype as a
for i in range(a.shape[0]):
    c[i, :b[i]] = a[i, :b[i]]

Another is to first define an auxiliary mask matrix and then multiply it element-wise with a:

aux = np.zeros_like(a)  # integer mask, so c keeps the dtype of a
for i in range(a.shape[0]):
    aux[i, :b[i]] = 1
c = a * aux

What I am looking for are vectorized solutions that scale well in runtime when I increase the number of rows of a.

H1ghfiv3
  • Does this help: https://stackoverflow.com/a/67173501/7831421 . Note that in this answer `use = b[:, None]` and `have` is your `a` – Amit Vikram Singh Apr 20 '21 at 14:28
  • Thanks for the hint. Using the link, I came up with the following idea: first `n, m = a.shape`, then `aux = np.repeat(np.arange(m)[None,:], n, axis = 0) < b`, and finally `c = a * aux`. However, my knowledge of Python is far too limited to tell whether this is the solution that scales best in runtime when I increase the number of rows. – H1ghfiv3 Apr 20 '21 at 15:12
  • Yes. That should do the job if you use `b[:, None]` in place of `b` when creating `aux`. – Amit Vikram Singh Apr 20 '21 at 15:17
  • @H1ghfiv3 You need to reshape `b` so it is broadcast correctly: `aux = np.repeat(np.arange(m)[None,:], n, axis = 0) < b.reshape(-1, 1)` (try with a non-square matrix `a`) – perl Apr 20 '21 at 15:19
  • Right, I mixed up the input shape of b with the (correct) one that I've tested the code with on my PC. Thanks – H1ghfiv3 Apr 20 '21 at 15:25

2 Answers


Use NumPy broadcasting to create a boolean mask that is False at column indices >= b[i], i.e. at the positions where a should be replaced by 0. Then multiply a by the mask to get c.

n, m = a.shape
mask = np.repeat(np.arange(m)[None, :], n, axis = 0) < b[:, None]
c = mask * a

Output:

>>> c
array([[0, 0, 0],
       [4, 0, 0],
       [7, 8, 0]])
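
As a side note, the `np.repeat` call is probably not needed: comparing a 1-D range of column indices against `b[:, None]` should broadcast to the full (n, m) mask on its own. Below is a minimal sketch of that variant, not the code from the answer above; the function name `mask_rows` is made up for illustration.

```python
import numpy as np

def mask_rows(a, b):
    """Keep the first b[i] entries of row i of a and zero out the rest.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    cols = np.arange(a.shape[1])   # shape (m,): column indices 0..m-1
    mask = cols < b[:, None]       # (m,) vs (n, 1) broadcasts to (n, m)
    return a * mask                # boolean mask acts as 0/1, dtype of a is kept

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([0, 1, 2])
c = mask_rows(a, b)
```

This avoids materializing the repeated index matrix, so the only (n, m) temporaries are the mask and the result.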
Amit Vikram Singh

As an option, you can make a mask matrix and multiply a element-wise by that mask:

a * (np.mgrid[:a.shape[0], :a.shape[1]][1] < b.reshape(-1, 1))

Output:

array([[0, 0, 0],
       [4, 0, 0],
       [7, 8, 0]])

Here's how the mask is constructed, step by step:

np.mgrid[:a.shape[0], :a.shape[1]][1]
# array([[0, 1, 2],
#        [0, 1, 2],
#        [0, 1, 2]])

np.mgrid[:a.shape[0], :a.shape[1]][1] < b.reshape(-1, 1)
# array([[False, False, False],
#        [ True, False, False],
#        [ True,  True, False]])
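
One caveat with multiplying by the mask: if a is a float array containing inf or nan, those entries times zero give nan rather than 0. A possible alternative, sketched here under the assumption that the masked positions should always be exactly 0, is to select with `np.where` instead of multiplying. The float array and the values of `b` below are made up for illustration.

```python
import numpy as np

# Illustrative float input with an inf in a position that should be zeroed.
a = np.array([[1.0, np.inf, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([1, 2])

# Same mask as before: True where the column index is below b[i].
mask = np.arange(a.shape[1]) < b[:, None]

# Selecting instead of multiplying: masked-out positions become exactly 0.
c_where = np.where(mask, a, 0)

# For comparison, multiplication turns inf * 0 into nan
# (the warning NumPy would emit is silenced here).
with np.errstate(invalid="ignore"):
    c_mul = a * mask
```

For integer input such as the a in the question, both variants should give the same result.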
perl