python replace value in array based on previous and following value in column

Question

Given the following array, I want to replace each zero with the previous value in its column, as long as the zero is surrounded by two values greater than zero (one directly above and one directly below). I am aware of np.where, but it would consider the whole array instead of its columns. I am not sure how to do this, and any help would be appreciated.

This is the array:

a=np.array([[4, 3, 3, 2],
            [0, 0, 1, 2],
            [0, 4, 2, 4],
            [2, 4, 3, 0]])

and since the only zero that meets this condition is the second row/second column one, the new array should be the following

new_a=np.array([[4, 3, 3, 2],
               [0, 3, 1, 2],
               [0, 4, 2, 4],
               [2, 4, 3, 0]])

How do I accomplish this?

And what if I would like to widen the gap that is surrounded by non-zero values? For instance, the first column contains two consecutive zeros and the second column contains one zero, so the new array would be

new_a=np.array([[4, 3, 3, 2],
               [4, 3, 1, 2],
               [4, 4, 2, 4],
               [2, 4, 3, 0]])

In short, how do I solve this if the column-wise condition is a run of N or fewer consecutive zeros?

Surrounded how? top/down? top/down/left/right? Any 7 neighbor? — mozway, Sep 28 '22 at 08:53

mozway · Answer 1 · 2022-09-28T11:39:12.967

1

As a generic method, I would approach this using a convolution:

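import numpy as np
from scipy.signal import convolve2d

# kernel for top/down neighbors
kernel = np.array([[1],
                   [0],
                   [1]])
# is the value a zero?
m1 = a == 0
# are both vertical neighbors non-zero? (count of non-zero neighbors must be 2)
m2 = convolve2d(~m1, kernel, mode='same') > 1

# zeros with a non-zero value directly above and below
mask = m1 & m2

# replace matching values with the previous row's value (modifies a in place)
a[mask] = np.roll(a, 1, axis=0)[mask]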

output:

array([[4, 3, 3, 2],
       [0, 3, 1, 2],
       [0, 4, 2, 4],
       [2, 4, 3, 0]])
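
For the single-zero case, the same top/bottom check can also be written with plain slicing instead of a convolution. This is only a minimal sketch, not part of the original answer: it assumes "surrounded" means strictly positive values directly above and below (as stated in the question), and the name `result` is introduced here for illustration.

```python
import numpy as np

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])

# interior rows that are zero and have a positive value directly above and below;
# the first and last rows stay False because they lack one of the two neighbors
mask = np.zeros(a.shape, dtype=bool)
mask[1:-1] = (a[1:-1] == 0) & (a[:-2] > 0) & (a[2:] > 0)

# copy the value from the row above into each matching position
result = a.copy()
result[1:] = np.where(mask[1:], a[:-1], a[1:])
print(result)
```

Unlike the in-place assignment above, this leaves `a` untouched and returns the filled copy in `result`.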

filling from surrounding values

Using pandas to benefit from ffill/bfill (you can forward-fill in pure numpy but it's more complex):

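import pandas as pd
df = pd.DataFrame(a)

# limit for neighbors
N = 2

# identify non-zeros
m = df.ne(0)
# keep True for non-zeros, NaN for zeros
m2 = m.where(m)
# positions within N rows of a non-zero value both above and below
mask = m2.ffill(limit=N).notna() & m2.bfill(limit=N).notna()
# hide the matching zeros, forward-fill them, and convert back to an integer array
df.mask(mask & ~m).ffill().astype(int).to_numpy()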

array([[4, 3, 3, 2],
       [4, 3, 1, 2],
       [4, 4, 2, 4],
       [2, 4, 3, 0]])
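
For comparison, here is a rough sketch of the pure-numpy route that the answer describes as more complex. It is not the answer's method: it swaps the ffill/bfill masks for an explicit loop over each column's runs of zeros, and the helper name `fill_short_gaps` and its parameter `n` are hypothetical names chosen for this example.

```python
import numpy as np

def fill_short_gaps(a, n):
    """Hypothetical helper: column by column, forward-fill every run of at most
    n consecutive zeros that has a non-zero value directly above and below."""
    out = a.copy()
    rows, cols = out.shape
    for c in range(cols):
        col = out[:, c]  # view: writing to col updates out
        r = 0
        while r < rows:
            if col[r] != 0:
                r += 1
                continue
            start = r
            # advance to the end of this run of zeros (rows start .. r-1)
            while r < rows and col[r] == 0:
                r += 1
            # the run must not touch the top or bottom edge of the column
            bounded = start > 0 and r < rows
            if bounded and (r - start) <= n:
                col[start:r] = col[start - 1]
    return out

a = np.array([[4, 3, 3, 2],
              [0, 0, 1, 2],
              [0, 4, 2, 4],
              [2, 4, 3, 0]])
print(fill_short_gaps(a, 2))
```

With `n=1` only the single zero in the second column is filled; with `n=2` the two-zero run in the first column is filled as well, while the trailing zero in the last column is left alone because nothing follows it.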

edited Sep 28 '22 at 11:39

answered Sep 28 '22 at 09:00

mozway


your solution works for this example, but if I would like to increase the threshold to >2 or 3,4.... changing the parameter in m2 does not work. What needs to be done in this case? – user18735627 Sep 28 '22 at 09:09
Can you provide an example? – mozway Sep 28 '22 at 09:14
let's for instance say that the value in between has to be 0 but the adjacent values columnwise must be greater than 2 or greater than 3 etc.... how would this code change? – user18735627 Sep 28 '22 at 09:17
I see, then `m2 = convolve2d(a>3, kernel, mode='same') > 1` for greater than 3 – mozway Sep 28 '22 at 09:19
Thanks. And what if you want to widen the gap? e.g. 2 or less zeros? This way, the first column would be 4,4,4,2 and the second 3,3,4,4 while the 3rd and 4th columns would remain the same ? – user18735627 Sep 28 '22 at 09:46
Again, a specific example would be more explicit. But you can use any kernel size for the convolution – mozway Sep 28 '22 at 09:49
that's the example, but in the first column there are 2 zeros, so what if I want to extend the condition to include two or N consecutive zeros instead of just one ? I have tried to enlarge the kernel but it did not work – user18735627 Sep 28 '22 at 10:21
Please edit the question with a clear example – mozway Sep 28 '22 at 10:51
ok, I have just edited it – user18735627 Sep 28 '22 at 11:00
@user18735627 while you can do this in pure numpy it's a bit tedious, I would use pandas for that (see update) – mozway Sep 28 '22 at 11:32

score 0 · Answer 2 · answered Sep 28 '22 at 09:16

That's one solution I found. I know it's basic but I think it works.

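import numpy as np

a=np.array([[4, 3, 3, 2],
            [0, 0, 1, 2],
            [0, 4, 2, 4],
            [2, 4, 3, 0]])
# a.T is a view, so writing to a_t also changes a
a_t = a.T

for i in range(len(a_t)):
    ar = a_t[i]  # i-th column of a
    # skip the first and last rows: they lack a neighbor on one side
    for j in range(1, len(ar)-1):
        # zero with a positive value directly above and below
        if (ar[j-1] > 0) and (ar[j] == 0) and (ar[j+1] > 0):
            ar[j] = ar[j-1]
a = a_t.T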
