np.mean() vs np.average() in Python NumPy?

Question

I notice that

In [30]: np.mean([1, 2, 3])
Out[30]: 2.0

In [31]: np.average([1, 2, 3])
Out[31]: 2.0

However, there should be some differences, since after all they are two different functions.

What are the differences between them?

Actually, the documentation doesn't make it immediately clear, as far as I can see. Not saying it is impossible to tell, but I think this question is valid for Stack Overflow all the same. — BlackVegetable, Nov 18 '13 at 17:47
@joaquin: "Compute the arithmetic mean along the specified axis." vs "Compute the weighted average along the specified axis."? — Blender, Nov 19 '13 at 00:01
@Blender right. I was just trying to make a kind of funny response to your comment because if I follow your instructions the first thing I read in the [docs for numpy.mean](http://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html) is *numpy.mean : Returns the average of the array elements* which is funny if you are looking for the answer to the OP question. — joaquin, Nov 19 '13 at 16:05

Hammer · Accepted Answer · 2017-11-27T21:47:54.117

234

np.average takes an optional weight parameter. If it is not supplied they are equivalent. Take a look at the source code: Mean, Average

np.mean:

try:
    mean = a.mean
except AttributeError:
    return _wrapit(a, 'mean', axis, dtype, out)
return mean(axis, dtype, out)

np.average:

...
if weights is None:
    avg = a.mean(axis)
    scl = avg.dtype.type(a.size/avg.size)
else:
    #code that does weighted mean here

if returned: #returned is another optional argument
    scl = np.multiply(avg, 0) + scl
    return avg, scl
else:
    return avg
...
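
To make the excerpt above concrete, here is a minimal sketch of what the optional `weights` argument changes. The array values are made up for illustration; `weights` and `returned` are the parameter names used in the source excerpt.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])
weights = np.array([3.0, 1.0, 1.0])  # illustrative weights, not from the question

# Without weights, np.average falls through to the plain mean.
unweighted = np.average(values)

# With weights, each value counts in proportion to its weight.
weighted = np.average(values, weights=weights)

# The same quantity written out by hand: sum(w * x) / sum(w).
manual = np.sum(values * weights) / np.sum(weights)

# returned=True additionally hands back the sum of the weights.
avg, weight_sum = np.average(values, weights=weights, returned=True)

print(unweighted, weighted, manual, weight_sum)
```

So the two functions only diverge once `weights` (or `returned`) is actually passed.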

edited Nov 27 '17 at 21:47

answered Nov 18 '13 at 17:51

Hammer


82

Why do they offer two different functions? Seems they should just offer `np.average` since `weights` is already optional. Seems unnecessary and only serves to confuse users. – Geoff Nov 30 '15 at 22:03
12

@Geoff I would rather have them throw a NotImplementedException for "average", to educate users that the arithmetic mean is not identical to "the average". – FooBar Jun 26 '18 at 11:15

score 47 · Answer 2 · answered Nov 18 '13 at 17:50


np.mean always computes an arithmetic mean, and has some additional options for input and output (e.g. what datatypes to use, where to place the result).

np.average can compute a weighted average if the weights parameter is supplied.
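
As a rough illustration of the "additional options" mentioned for np.mean, the sketch below uses its `dtype` and `out` parameters on a small made-up integer array. The variable names are hypothetical.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]], dtype=np.int32)

# dtype controls the type used to accumulate and return the mean.
col_means = np.mean(a, axis=0, dtype=np.float64)

# out lets the result be written into a preallocated array.
row_means = np.empty(2, dtype=np.float64)
np.mean(a, axis=1, out=row_means)

print(col_means, row_means)
```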

answered Nov 18 '13 at 17:50

Amber


G M · Answer 3 · 2018-12-06T15:11:44.057

32

In some versions of NumPy there is another important difference that you must be aware of:

average does not take masks into account, so it computes the average over the whole set of data.

mean takes masks into account, so it computes the mean only over unmasked values.

g = [1,2,3,55,66,77]
f = np.ma.masked_greater(g,5)

np.average(f)
Out: 34.0

np.mean(f)
Out: 2.0
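
Since the result of `np.average` on a masked array appears to depend on the NumPy version (see the comments on this answer), a version-independent sketch is to use the masked-array routines explicitly, or to drop the masked values first. This reuses the same `g` and `f` as above.

```python
import numpy as np

g = [1, 2, 3, 55, 66, 77]
f = np.ma.masked_greater(g, 5)  # masks 55, 66, 77

# np.mean dispatches to the masked array's own mean method.
masked_mean = np.mean(f)

# np.ma.average is the mask-aware counterpart of np.average.
masked_avg = np.ma.average(f)

# Alternatively, strip the masked entries and average a plain ndarray.
compressed_avg = np.average(f.compressed())

print(masked_mean, masked_avg, compressed_avg)
```

All three expressions only look at the unmasked values 1, 2 and 3.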

edited Dec 06 '18 at 15:11

answered Aug 05 '16 at 07:40

G M


3

Note: `np.ma.average` works. Also, there is a [bug report](https://github.com/numpy/numpy/issues/7330). – Neil G Mar 29 '17 at 01:53
2

`np.average` and `np.mean` both take masks into account. I tried it and got "Out: `2.0`" for both. – CEB Jun 30 '22 at 14:40
@CEB A newer version probably fixed the bug. Thanks for reporting. – G M Jun 30 '22 at 16:42

Grant Petty · Answer 4 · 2020-05-19T17:28:38.423

In addition to the differences already noted, there's another extremely important difference that I just now discovered the hard way: unlike np.mean, np.average doesn't allow the dtype keyword, which is essential for getting correct results in some cases. I have a very large single-precision array that is accessed from an h5 file. If I take the mean along axes 0 and 1, I get wildly incorrect results unless I specify dtype='float64':

>>> T.shape
(4096, 4096, 720)
>>> T.dtype
dtype('<f4')

m1 = np.average(T, axis=(0,1))                #  garbage
m2 = np.mean(T, axis=(0,1))                   #  the same garbage
m3 = np.mean(T, axis=(0,1), dtype='float64')  # correct results

Unfortunately, unless you know what to look for, you can't necessarily tell your results are wrong. I will never use np.average again for this reason but will always use np.mean(.., dtype='float64') on any large array. If I want a weighted average, I'll compute it explicitly using the product of the weight vector and the target array and then either np.sum or np.mean, as appropriate (with appropriate precision as well).
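
The explicit weighted average described above could be sketched roughly as follows. `weighted_mean_f64` is a hypothetical helper name, and the small arrays stand in for the large float32 data; the only point is that both sums accumulate in float64.

```python
import numpy as np

def weighted_mean_f64(a, w, axis=None):
    """Weighted mean of `a` with weights `w`, accumulated in float64.

    Hypothetical helper: `w` is assumed to broadcast against `a`.
    """
    a = np.asarray(a)
    w = np.broadcast_to(np.asarray(w), a.shape)
    # dtype=np.float64 makes np.sum accumulate in double precision
    # even though the inputs are single precision.
    numerator = np.sum(a * w, axis=axis, dtype=np.float64)
    denominator = np.sum(w, axis=axis, dtype=np.float64)
    return numerator / denominator

# Small illustrative stand-ins for the real float32 data.
T_small = np.array([[1, 2], [3, 4]], dtype=np.float32)
w_small = np.array([[1, 1], [3, 1]], dtype=np.float32)

result = weighted_mean_f64(T_small, w_small, axis=0)

# Unweighted case: np.mean with an explicit float64 accumulator.
plain = np.mean(T_small, axis=0, dtype=np.float64)

print(result, plain)
```

Note that the product `a * w` is still formed in the input precision here; only the accumulation is promoted. Casting the inputs to float64 first would avoid that at the cost of extra memory.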

Very surprising. Do you know why this happens, and can you file a bug report? Thanks — Sanjay Manohar, Sep 22 '20 at 13:48

score 4 · Answer 5 · answered Nov 18 '13 at 17:50


In your invocation, the two functions are the same.

average can compute a weighted average though.

Doc links: mean and average

answered Nov 18 '13 at 17:50

Prashant Kumar

