A good way to find out why something is slower is to profile it. I'll use the 3rd party library line_profiler
and the IPython command %lprun
(see for example this blog) here:
%load_ext line_profiler
import numpy as np
arr = np.full((3, 3), -9999, dtype=float)
%lprun -f np.ma.average np.ma.average(arr, axis=0)
Line # Hits Time Per Hit % Time Line Contents
==============================================================
519 def average(a, axis=None, weights=None, returned=False):
...
570 1 1810 1810.0 30.5 a = asarray(a)
571 1 15 15.0 0.3 m = getmask(a)
572
573 # inspired by 'average' in numpy/lib/function_base.py
574
575 1 5 5.0 0.1 if weights is None:
576 1 3500 3500.0 59.0 avg = a.mean(axis)
577 1 591 591.0 10.0 scl = avg.dtype.type(a.count(axis))
578 else:
...
608
609 1 7 7.0 0.1 if returned:
610 if scl.shape != avg.shape:
611 scl = np.broadcast_to(scl, avg.shape).copy()
612 return avg, scl
613 else:
614 1 5 5.0 0.1 return avg
I removed some irrelevant lines.
So actually 30% of the time is spent in np.ma.asarray
(something that arr.mean
doesn't have to do!).
However the relative times change drastically if you use a bigger array:
arr = np.full((1000, 1000), -9999, dtype=float)
%lprun -f np.ma.average np.ma.average(arr, axis=0)
Line # Hits Time Per Hit % Time Line Contents
==============================================================
519 def average(a, axis=None, weights=None, returned=False):
...
570 1 609 609.0 7.6 a = asarray(a)
571 1 14 14.0 0.2 m = getmask(a)
572
573 # inspired by 'average' in numpy/lib/function_base.py
574
575 1 7 7.0 0.1 if weights is None:
576 1 6924 6924.0 86.9 avg = a.mean(axis)
577 1 404 404.0 5.1 scl = avg.dtype.type(a.count(axis))
578 else:
...
609 1 6 6.0 0.1 if returned:
610 if scl.shape != avg.shape:
611 scl = np.broadcast_to(scl, avg.shape).copy()
612 return avg, scl
613 else:
614 1 6 6.0 0.1 return avg
This time the np.ma.MaskedArray.mean
function almost takes up 90% of the time.
Note: You could also dig deeper and look into np.ma.asarray
or np.ma.MaskedArray.count
or np.ma.MaskedArray.mean
and check their line profilings. But I just wanted to show that there are lots of called function that add to the overhead.
So the next question is: did the relative times between np.ndarray.mean
and np.ma.average
also change? And at least on my computer the difference is much lower now:
%timeit np.ma.average(arr, axis=0)
# 2.96 ms ± 91 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit arr.mean(axis=0)
# 1.84 ms ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
This time it's not even 2 times slower. I assume for even bigger arrays the difference will get even smaller.
This is also something that is actually quite common with NumPy:
The constant factors are quite high even for plain numpy functions (see for example my answer to the question "Performance in different vectorization method in numpy"). For np.ma
these constant factors are even bigger, especially if you don't use a np.ma.MaskedArray
as input. But even though the constant factors might be high, these functions excel with big arrays.