I was comparing the speed of two versions of the same function: one that takes a reversed view of a NumPy array twice before processing it, and one that does not. The code is as follows:

import numpy as np
from numba import njit

@njit
def min_getter(arr):
    # Running minimum: result[i] is the smallest value in arr[:i + 1].
    if len(arr) > 1:
        result = np.empty(len(arr), dtype=arr.dtype)
        local_min = arr[0]
        result[0] = local_min
        for i in range(1, len(arr)):
            if arr[i] < local_min:
                local_min = arr[i]
            result[i] = local_min
        return result
    else:
        return arr

@njit
def min_getter_rev1(arr1):
    # Same running minimum, but computed on a view that was reversed twice.
    if len(arr1) > 1:
        arr = arr1[::-1][::-1]
        result = np.empty(len(arr), dtype=arr.dtype)
        local_min = arr[0]
        result[0] = local_min
        for i in range(1, len(arr)):
            if arr[i] < local_min:
                local_min = arr[i]
            result[i] = local_min
        return result
    else:
        return arr1

size = 500000
x = np.arange(size)
y = np.hstack((x[::-1], x))
y_min = min_getter(y)
yrev_min = min_getter_rev1(y)
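
As a sanity check, the double reversal arr1[::-1][::-1] should be a view onto the same memory as the original, with the same strides, so both functions should see identical data. Here is a minimal NumPy-only sketch of that check. It does not involve numba, and it uses np.minimum.accumulate as a reference for the running minimum, which is my assumption and not part of the code above:

```python
import numpy as np

size = 500000
x = np.arange(size)
y = np.hstack((x[::-1], x))

# Reversing a view twice yields another view, not a copy.
y_rev2 = y[::-1][::-1]

same_memory = np.shares_memory(y, y_rev2)
same_strides = y.strides == y_rev2.strides
# Address of the first element of each array.
same_start = (y.__array_interface__["data"][0]
              == y_rev2.__array_interface__["data"][0])

# Running minimum computed by NumPy, used here only as a reference result.
ref = np.minimum.accumulate(y)
ref_rev2 = np.minimum.accumulate(y_rev2)
same_result = np.array_equal(ref, ref_rev2)

print(same_memory, same_strides, same_start, same_result)
```

This only describes the array at the NumPy level; it says nothing about how numba types or compiles the double-reversed view.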
Surprisingly, the version with the extra operation runs slightly faster across repeated measurements. I ran %timeit about 10 times on each function and tried different array sizes, and the difference is apparent (at least on my computer). The runtime of min_getter is around:
2.35 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
(sometimes it is 2.33 ms and sometimes 2.37 ms, but it never goes below 2.30 ms)
and the runtime of min_getter_rev1 is around:
2.22 ms ± 23.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
(sometimes it is 2.25 ms and sometimes 2.23 ms, but it rarely goes above 2.30 ms)
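
For anyone who wants to reproduce these measurements outside IPython, here is a rough sketch of a timing harness built on the standard timeit module. The helper name compare and its parameters are hypothetical (they are not part of my original test). It calls each function once first so that JIT compilation is excluded, and it alternates between the functions on every repeat so that changes in machine load affect both about equally:

```python
import timeit


def compare(funcs, arg, repeat=7, number=100):
    """Return {name: best seconds per call} for each function in funcs."""
    # Warm-up call: for a numba-jitted function this triggers compilation,
    # so compile time is not included in the timings below.
    for func in funcs.values():
        func(arg)

    best = {name: float("inf") for name in funcs}
    # Interleave the functions on every repeat instead of timing one
    # function completely before the other.
    for _ in range(repeat):
        for name, func in funcs.items():
            elapsed = timeit.timeit(lambda: func(arg), number=number)
            best[name] = min(best[name], elapsed / number)
    return best


# Hypothetical usage with the functions and array defined above:
# times = compare({"min_getter": min_getter,
#                  "min_getter_rev1": min_getter_rev1}, y)
# for name, seconds in times.items():
#     print(name, seconds * 1e3, "ms per call")
```

Note that this reports the best time per call, not the mean ± std. dev. that %timeit prints, so the numbers are not directly comparable with the ones quoted here.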
Any ideas on why and how this happens? The difference is a speed-up of roughly 4-6%, which can matter a lot in some applications. Understanding the underlying mechanism might help speed up other jitted code.
Note1: I've tried size=5000000 and tested each function 5-10 times, and the difference is even more apparent. The faster one runs at 23.2 ms ± 51.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) and the slower one at 24.4 ms ± 234 µs per loop (mean ± std. dev. of 7 runs, 10 loops each).
Note2: The versions of numpy and numba used for the tests are 1.16.5 and 0.45.1, respectively; the Python version is 3.7.4; the IPython version is 7.8.0; the IDE used is Spyder. The results may differ with other versions.