Applying the numpy solution from "Most efficient way to forward-fill NaN values in numpy array", with 0 as the value to fill instead of NaN:
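import numpy as np

def foo2(arr):
    # vectorized: each position keeps its own index if non-zero, else 0
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    # running maximum carries the last non-zero index forward
    idx = np.maximum.accumulate(idx)
    return arr[idx]

def foo1(arr):
    # plain Python loop, for comparison
    arr = arr.copy()
    # start at 1: the first element has no predecessor (i=0 would wrap to arr[-1])
    for i in range(1, len(arr)):
        if arr[i] == 0:
            arr[i] = arr[i-1]
    return arr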
For your test array, arr, the speed improvement is modest:
In [67]: timeit foo1(arr)
100000 loops, best of 3: 18.1 µs per loop
In [68]: timeit foo2(arr)
The slowest run took 1387.12 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 11.4 µs per loop
But with a larger array, the loop's time grows with the array size, while the array version barely changes:
In [69]: arr1=np.concatenate((arr,arr,arr,arr,arr,arr,arr))
In [70]: timeit foo1(arr1)
10000 loops, best of 3: 116 µs per loop
In [71]: timeit foo2(arr1)
The slowest run took 4.16 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 14.6 µs per loop
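The same comparison can be reproduced outside IPython with the standard timeit module. This is only a rough sketch: the array contents, the sizes, and the helper name time_call are assumptions made for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit
import numpy as np

def foo1(arr):
    # Python-level loop: cost grows with len(arr)
    arr = arr.copy()
    for i in range(1, len(arr)):
        if arr[i] == 0:
            arr[i] = arr[i - 1]
    return arr

def foo2(arr):
    # vectorized version
    idx = np.where(arr == 0, 0, np.arange(len(arr)))
    idx = np.maximum.accumulate(idx)
    return arr[idx]

def time_call(func, data, number=200):
    # hypothetical helper: best of 3 runs, in seconds per call
    runs = timeit.repeat(lambda: func(data), number=number, repeat=3)
    return min(runs) / number

# made-up base array; 0 marks the entries to fill
base = np.array([3, 1, 4, 0, 5, 9, 0, 0, 2, 6])

results = {}
for reps in (1, 10, 100):
    data = np.tile(base, reps)
    # both versions should agree before we bother timing them
    assert np.array_equal(foo1(data), foo2(data))
    results[len(data)] = (time_call(foo1, data), time_call(foo2, data))
    print(len(data), results[len(data)])
```

Each printed line shows the array length followed by the (foo1, foo2) times per call.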
The details of the idx construction:
In [72]: idx=np.arange(len(arr))
In [73]: idx[arr==0]=0
In [74]: idx
Out[74]:
array([ 0, 1, 2, 3, 4, 5, 6, 0, 8, 9, 0, 0, 12, 13, 0, 0, 0, 17, 18, 0, 0, 0, 0, 0, 0, 25])
In [75]: idx=np.maximum.accumulate(idx)
In [76]: idx
Out[76]:
array([ 0, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9, 9, 12, 13, 13, 13, 13, 17, 18, 18, 18, 18, 18, 18, 18, 25], dtype=int32)
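The last step, indexing arr with the accumulated idx, is what produces the filled array. Here is a minimal sketch of the whole sequence on a small made-up array (the values are an assumption, chosen only to make the steps easy to follow):

```python
import numpy as np

# hypothetical sample data; 0 marks a value to be filled
arr = np.array([5, 0, 0, 7, 0, 3, 0, 0])

# keep each position's own index where the value is non-zero, else 0
idx = np.where(arr == 0, 0, np.arange(len(arr)))
# idx is now [0, 0, 0, 3, 0, 5, 0, 0]

# a running maximum carries the last non-zero position forward
idx = np.maximum.accumulate(idx)
# idx is now [0, 0, 0, 3, 3, 5, 5, 5]

# fancy indexing pulls the value from that last non-zero position
filled = arr[idx]
print(filled)  # [5 5 5 7 7 3 3 3]
```

Note that a zero in the very first position has nothing before it to copy, so it stays zero, as do any zeros directly after it.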