NumPy's functions are designed for arrays, not for single values or scalars. They have a rather high overhead because they perform several checks and conversions that pay off for big arrays but are costly for scalars.
The conversion is really obvious if you check the type of the return value:
>>> import numpy as np
>>> import math
>>> type(np.log(2.))
numpy.float64
>>> type(math.log(2.))
float
On the other hand, the math module is optimized for scalars, so its functions don't need that many checks (I think there are only two: convert to float and check if it's <= 0). That is why math.log is faster for scalars than numpy.log.
But if you operate on arrays and want to take the logarithm of all elements in the array, NumPy can be much faster. On my computer, if I time np.log on an array against math.log on each item of a list, the timings look different:
arr = np.arange(1, 10000000)
%timeit np.log(arr)
201 ms ± 959 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
lst = arr.tolist()
%timeit [math.log(item) for item in lst]
8.77 s ± 63.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So np.log is much faster on arrays (more than 40 times faster in this case)! And you don't need to write any loop yourself. As a ufunc, np.log also works correctly on multidimensional NumPy arrays and allows you to do the operation in place.
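A minimal sketch of those two ufunc features, assuming a small float array made up purely for illustration (the names arr, logs and buf are not from the timings above):

```python
import numpy as np

# A 2-D float array; np.log is applied element-wise and the shape is kept.
arr = np.arange(1.0, 7.0).reshape(2, 3)
logs = np.log(arr)
print(logs.shape)  # (2, 3)

# In-place variant: write the results back into the input buffer via `out`.
# This needs a floating-point array, because the float results cannot be
# stored in an integer array.
buf = arr.copy()
result = np.log(buf, out=buf)
print(result is buf)  # True
```

Note that the in-place form overwrites the original values, so it only makes sense when you no longer need them.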
As a rule of thumb: if you have an array with thousands of items, NumPy will be faster; if you have scalars or only a few dozen items, math plus an explicit loop will be faster.
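Where exactly the crossover lies depends on your machine and library versions, so it is probably worth measuring yourself. The following is only a rough sketch: compare_log_timings is a hypothetical helper written for this answer, and the sizes are arbitrary choices.

```python
import math
import timeit

import numpy as np


def compare_log_timings(sizes=(1, 10, 100, 1_000, 100_000), repeat=5):
    """Return {size: (numpy_seconds, math_seconds)}, each per single call."""
    results = {}
    for n in sizes:
        arr = np.arange(1, n + 1, dtype=float)
        lst = arr.tolist()
        # Fewer inner iterations for large inputs to keep the total runtime short.
        number = max(1, 100_000 // n)
        t_np = min(timeit.repeat(lambda: np.log(arr),
                                 number=number, repeat=repeat)) / number
        t_math = min(timeit.repeat(lambda: [math.log(x) for x in lst],
                                   number=number, repeat=repeat)) / number
        results[n] = (t_np, t_math)
    return results


if __name__ == "__main__":
    for n, (t_np, t_math) in compare_log_timings().items():
        print(f"n={n:>7}: np.log {t_np:.2e} s, math.log loop {t_math:.2e} s")
```

The size at which the first column drops below the second is the point where switching to NumPy starts to pay off on that machine.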
Also, don't use time for timing code. There are dedicated modules that give more accurate results, better statistics, and disable garbage collection during the timings.
I generally use %timeit, which is a convenient wrapper around the timeit functionality, but it requires IPython. It already conveniently displays the mean and deviation of the result and does some (mostly) useful statistics like displaying the "best of 7" or "best of 3" result.
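If IPython is not available, something similar can be approximated with the standard-library timeit module directly. This is just a sketch; the statement being timed and the repeat/number values are arbitrary choices for illustration:

```python
import math
import statistics
import timeit

# Each entry in `totals` is the time for `number` calls of the statement.
# timeit disables garbage collection while timing by default.
number = 100_000
totals = timeit.repeat("math.log(2.0)", globals={"math": math},
                       repeat=7, number=number)
per_call = [t / number for t in totals]

print(f"best of 7: {min(per_call) * 1e9:.1f} ns per call")
print(f"mean: {statistics.mean(per_call) * 1e9:.1f} ns, "
      f"std. dev.: {statistics.stdev(per_call) * 1e9:.1f} ns")
```

The minimum is usually the least noisy number to compare, since slower runs tend to reflect interference from other processes rather than the code itself.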
I recently analyzed the runtime behaviour of NumPy functions for another question; some of the points also apply here.