I'm trying to learn Numba to speed up my Python code, and I plan to use the eager compilation mode of Numba JIT.
The commonly used lazy JIT mode can be written as follows:
import numpy as np
from numba import jit

@jit(nopython=True)
def jit_go_fast_lazy(a):
    trace = 0.0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i, i])
    return a + trace
When the program above runs, the first execution of jit_go_fast_lazy() will be slow since compilation takes time; subsequent executions will be tremendously faster.
Since my program is sensitive to execution time, it's essential to reduce the compilation overhead, so I plan to use the eager compilation mode, which, according to the official tutorial, should look like this:
from numba import float64, int64, jit

@jit(float64[:, :](int64[:, :]), nopython=True)
def jit_go_fast_eager(a):
    trace = 0.0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i, i])
    return a + trace
(The code examples here are adapted from the official Numba site.)
To measure the running time of the code above, I use the following:
import time

x = np.arange(100).reshape(10, 10)

start = time.time()
jit_go_fast_lazy(x)
end = time.time()
print("Elapsed = %s" % (end - start))
start = time.time()
jit_go_fast_lazy(x)
end = time.time()
print("Elapsed = %s" % (end - start))
start = time.time()
jit_go_fast_eager(x)
end = time.time()
print("Elapsed = %s" % (end - start))
start = time.time()
jit_go_fast_eager(x)
end = time.time()
print("Elapsed = %s" % (end - start))
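(As an aside, time.time() has relatively coarse resolution for microsecond-scale measurements; a small sketch of a helper built on time.perf_counter() instead. The helper name measure is my own, not from the original code:)

```python
import time

def measure(fn, *args):
    """Time a single call of fn using a monotonic, high-resolution clock."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start
```

This would replace each start/end pair above, e.g. print("Elapsed = %s" % measure(jit_go_fast_eager, x)).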
When running this test on macOS with Python 3.9.1, the results are as follows:
First run of JIT lazy compilation: 0.14231 sec
Second run of JIT lazy compilation: 3.0994e-06 sec
First run of JIT eager compilation: 8.8930e-05 sec
Second run of JIT eager compilation: 1.9073e-06 sec
From what I understand, the running time of jit_go_fast_eager() should be approximately the same whether it is run for the first time or not (since it has already been compiled). However, the results show that when jit_go_fast_eager() was executed for the second time, it was roughly 45 times faster.
I searched for an explanation of this confusing result and found this question, so I changed the decorator of jit_go_fast_eager() to:
@jit(float64[:, ::1](int64[:, ::1]), nopython=True)
The result is:
First run of JIT lazy compilation: 0.14407 sec
Second run of JIT lazy compilation: 3.0994e-06 sec
First run of JIT eager compilation: 6.0081e-05 sec
Second run of JIT eager compilation: 2.1458e-06 sec
There is only a minor change in the running times.
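(One thing I can verify: the test input produced by np.arange(100).reshape(10, 10) is already C-contiguous, so the ::1 contiguous-layout signature should match it without any conversion. A quick sketch to check:)

```python
import numpy as np

x = np.arange(100).reshape(10, 10)
# Reshaping a freshly created 1-D arange yields a C-contiguous array,
# so the int64[:, ::1] signature matches this input directly.
print(x.flags['C_CONTIGUOUS'])  # True
```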
Can someone kindly help with this? My main questions are:
- Why is there a huge difference in running time between the first and second executions of the eagerly compiled function?
- What's the best practice to accelerate a function like this? While the second execution is always much faster, the first execution can interfere with the overall timing of the program.
Thanks!