I need a fast way to get the indices of the neighbors that have the same value as the current element.
For example:
arr = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
indicies = func(arr, 6)
# [5, 6, 7, 8]
The element at index 6 has value 1, so I need the full slice containing index 6 and all of its contiguous neighbors with the same value.
This is like one step of a flood fill algorithm. Is there a fast way to do it in NumPy? Is there a way to do it for a 2D array?
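To make the expected behavior concrete, here is a minimal reference sketch for the 1D case. It is deliberately simple rather than fast, and the name `func_reference` is only a placeholder for illustration, not an existing function:

```python
import numpy as np

def func_reference(arr, idx):
    """Return the indices of the contiguous run of equal values that contains idx."""
    arr = np.asarray(arr)
    val = arr[idx]
    # Walk left while the previous element still has the same value.
    start = idx
    while start > 0 and arr[start - 1] == val:
        start -= 1
    # Walk right while the next element still has the same value.
    end = idx
    while end < arr.size - 1 and arr[end + 1] == val:
        end += 1
    return list(range(start, end + 1))

arr = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
print(func_reference(arr, 6))  # [5, 6, 7, 8]
```

The question is whether the same result can be obtained faster.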
EDIT
Here are some performance tests:
import numpy as np
import random
np.random.seed(1488)
arr = np.zeros(5000)
for x in np.random.randint(0, 5000, size=100):
    arr[x:x+50] = 1
I will compare the function from @Ehsan:
def func_Ehsan(arr, idx):
    change = np.insert(np.flatnonzero(np.diff(arr)), 0, -1)
    loc = np.searchsorted(change, idx)
    start = change[max(loc-1,0)]+1 if loc<len(change) else change[loc-1]
    end = change[min(loc, len(change)-1)]
    return (start, end)
change = np.insert(np.flatnonzero(np.diff(arr)), 0, -1)
def func_Ehsan_same_arr(arr, idx):
    loc = np.searchsorted(change, idx)
    start = change[max(loc-1,0)]+1 if loc<len(change) else change[loc-1]
    end = change[min(loc, len(change)-1)]
    return (start, end)
with my pure Python function:
def my_func(arr, index):
    val = arr[index]
    size = arr.size
    end = index + 1
    while end < size and arr[end] == val:
        end += 1
    start = index - 1
    while start > -1 and arr[start] == val:
        start -= 1
    return start + 1, end
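One caveat when comparing the two: as far as I can tell, the functions do not use the same end convention. `func_Ehsan` appears to return an inclusive end index, while `my_func` returns an exclusive one. A small sanity check on the example array from above (the names `small`, `ehsan_span`, and `my_span` are just for illustration):

```python
import numpy as np

def func_Ehsan(arr, idx):
    change = np.insert(np.flatnonzero(np.diff(arr)), 0, -1)
    loc = np.searchsorted(change, idx)
    start = change[max(loc-1,0)]+1 if loc<len(change) else change[loc-1]
    end = change[min(loc, len(change)-1)]
    return (start, end)

def my_func(arr, index):
    val = arr[index]
    size = arr.size
    end = index + 1
    while end < size and arr[end] == val:
        end += 1
    start = index - 1
    while start > -1 and arr[start] == val:
        start -= 1
    return start + 1, end

small = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 0])

# Convert NumPy integers to plain ints so the tuples compare and print cleanly.
ehsan_span = tuple(int(v) for v in func_Ehsan(small, 6))
my_span = tuple(int(v) for v in my_func(small, 6))

print(ehsan_span)  # (5, 8): end is the last index of the run
print(my_span)     # (5, 9): end is one past the last index of the run
```

This does not affect the timings below, but the results need a `+1` adjustment before they can be compared directly.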
Take a look:
np.random.seed(1488)
%timeit my_func(arr, np.random.randint(0, 5000))
# 42.4 µs ± 700 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
np.random.seed(1488)
%timeit func_Ehsan(arr, np.random.randint(0, 5000))
# 115 µs ± 1.92 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
np.random.seed(1488)
%timeit func_Ehsan_same_arr(arr, np.random.randint(0, 5000))
# 18.1 µs ± 953 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Is there a way to implement the same logic in NumPy, without a C module, Cython, Numba, or Python loops, and make it faster?