I have the following 2D NumPy array:

import copy

import numpy as np

matrix = np.array([
    [ 0,  1,  4,  3],
    [ 1,  2,  5,  4],
    [ 3,  4,  7,  6],
    [ 4,  5,  8,  7],
    [ 2, 10, 13,  5],
    [10, 11, 14, 13],
    [ 5, 13, 16,  8],
    [13, 14, 17, 16],
    [18, 19, 22, 21],
    [19, 11, 10, 22],
    [21, 22,  1,  0],
    [22, 10,  2,  1]])

I have another array which carries the values that I want to replace inside matrix.

substitutes = np.array([ 0,  1,  2,  3,  4,  5,  6,  7,  8, 10, 11, 13, 14, 16, 17, 18, 19, 21, 22])

First I find the indices of each substitute inside matrix (multiple occurrences are possible):

indices = [np.argwhere(s == matrix) for s in substitutes]

Then I do:

matrix_renumbered = copy.deepcopy(matrix)

for i, indices_per_value in enumerate(indices):
    for index in indices_per_value:
        # the substitutes are replaced just by the counter i (to be contiguous)
        matrix_renumbered[index[0], index[1]] = i

Expected result:

array([[ 0,  1,  4,  3],
       [ 1,  2,  5,  4],
       [ 3,  4,  7,  6],
       [ 4,  5,  8,  7],
       [ 2,  9, 11,  5],
       [ 9, 10, 12, 11],
       [ 5, 11, 13,  8],
       [11, 12, 14, 13],
       [15, 16, 18, 17],
       [16, 10,  9, 18],
       [17, 18,  1,  0],
       [18,  9,  2,  1]])

Is there a better way (e.g. using numpy) to do what the double for-loop does?

Andy

chiefenne

1 Answer

You can remove the inner for loop by taking advantage of advanced indexing:

for value, adv_idx in enumerate(tuple(zip(*idx)) for idx in indices):
    matrix_renumbered[adv_idx] = value

Output:

array([[ 0,  1,  4,  3],
       [ 1,  2,  5,  4],
       [ 3,  4,  7,  6],
       [ 4,  5,  8,  7],
       [ 2,  9, 11,  5],
       [ 9, 10, 12, 11],
       [ 5, 11, 13,  8],
       [11, 12, 14, 13],
       [15, 16, 18, 17],
       [16, 10,  9, 18],
       [17, 18,  1,  0],
       [18,  9,  2,  1]])

Advanced indexing lets you pass in a list of row coordinates and a list of column coordinates, and access those elements of the array:

In [1]: x = np.random.randint(0, 9, (5, 5))

In [2]: x
Out[2]:
array([[1, 2, 4, 0, 5],
       [1, 5, 7, 4, 3],
       [1, 3, 6, 8, 0],
       [6, 3, 7, 6, 3],
       [4, 3, 6, 8, 6]])

In [3]: x[[1, 2, 3], [0, 0, 0]] = 999

In [4]: x
Out[4]:
array([[  1,   2,   4,   0,   5],
       [999,   5,   7,   4,   3],
       [999,   3,   6,   8,   0],
       [999,   3,   7,   6,   3],
       [  4,   3,   6,   8,   6]])

Here, you're taking rows 1, 2, and 3, and columns 0, 0, and 0 with the advanced index x[[1, 2, 3], [0, 0, 0]].
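To make the pairing explicit, here is a minimal sketch on a small deterministic array (the names `rows`, `cols`, `picked` and `pairs` are only illustrative). It also shows that the (row, col) pairs returned by np.argwhere can be turned into an advanced index by transposing them:

```python
import numpy as np

x = np.arange(25).reshape(5, 5)  # rows are 0..4, 5..9, 10..14, ...

# Advanced indexing pairs the two lists element by element:
# (1, 0), (2, 0), (3, 0). It does not take a cross product.
rows = [1, 2, 3]
cols = [0, 0, 0]
picked = x[rows, cols]
print(picked)  # [ 5 10 15]

# np.argwhere returns one (row, col) pair per match ...
pairs = np.argwhere(x % 10 == 0)  # [[0, 0], [2, 0], [4, 0]]

# ... so transposing yields one sequence of rows and one of columns.
print(x[tuple(pairs.T)])  # [ 0 10 20]
```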

The only change your indices need is that each entry must be transposed with zip() into two separate sequences, one of row values and one of column values (instead of a sequence of (row, col) pairs), which is achieved by this:

tuple(zip(*idx)) for idx in indices
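If you want to drop the remaining loop as well, one possible alternative (a different technique from the advanced indexing shown here, and only a sketch) is np.searchsorted. It assumes that substitutes is sorted in ascending order, which holds for the array in the question. Two rows of the question's matrix are used to keep the example short; `positions`, `clipped` and `found` are illustrative names:

```python
import numpy as np

substitutes = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 13, 14,
                        16, 17, 18, 19, 21, 22])
matrix = np.array([[ 2, 10, 13,  5],
                   [19, 11, 10, 22]])

# Assumption: substitutes is sorted ascending. searchsorted then returns,
# for every element of matrix, its position within substitutes.
positions = np.searchsorted(substitutes, matrix)

# Guard for values that are not in substitutes: keep them unchanged.
clipped = np.minimum(positions, len(substitutes) - 1)
found = substitutes[clipped] == matrix
matrix_renumbered = np.where(found, clipped, matrix)

print(matrix_renumbered)
# [[ 2  9 11  5]
#  [16 10  9 18]]
```

When every value in matrix is known to occur in substitutes, the guard can be omitted and `positions` is already the renumbered result.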
ddejohn
  • Thanks for this. But, I finally implemented the [divakar 3 solution](https://stackoverflow.com/q/55949809/2264936), as it is also very fast. I didn't find this before. Your solution works as well, but seems to be even slower than the double for loop. I did some timing checks (although I won't guarantee that I did that the right way). – chiefenne Sep 10 '21 at 08:23
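For anyone who wants to repeat such a timing check, a rough sketch with timeit could look like the following. The random test matrix, its size, and the function names are assumptions made for illustration; the searchsorted variant additionally assumes that substitutes is sorted and contains every value of the matrix. Actual timings depend on the array size and the machine, so none are shown.

```python
import timeit

import numpy as np

substitutes = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 13, 14,
                        16, 17, 18, 19, 21, 22])

# Hypothetical test data: a random matrix drawn only from substitutes.
rng = np.random.default_rng(0)
matrix = rng.choice(substitutes, size=(200, 4))

indices = [np.argwhere(s == matrix) for s in substitutes]


def double_loop():
    out = matrix.copy()
    for i, per_value in enumerate(indices):
        for r, c in per_value:
            out[r, c] = i
    return out


def advanced_indexing():
    out = matrix.copy()
    for i, per_value in enumerate(indices):
        # Transposing the (n, 2) pairs gives a (rows, cols) advanced index.
        out[tuple(per_value.T)] = i
    return out


def searchsorted():
    # Assumes substitutes is sorted and covers every value in matrix.
    return np.searchsorted(substitutes, matrix)


for func in (double_loop, advanced_indexing, searchsorted):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds:.4f} s for 20 runs")
```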