31

I'd like to calculate the mean of an array in Python in this form:

Matrice = [1, 2, None]

I'd just like to have my None value ignored by the numpy.mean calculation but I can't figure out how to do it.

Alex Riley
  • 4
    +1: this question can be particularly relevant for arrays that are imported from a database, where values can sometimes be NULL. – Eric O. Lebigot Nov 22 '11 at 22:30

7 Answers

12

You are looking for masked arrays. Here's an example.

import numpy.ma as ma
a = ma.array([1, 2, None], mask=[0, 0, 1])  # mask=1 marks the entry to ignore
print("average =", ma.average(a))

From the numpy docs linked above, "The numpy.ma module provides a nearly work-alike replacement for numpy that supports data arrays with masks."
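
If you don't want to write the mask by hand, it can be derived from the data. This is only a rough sketch, not the one true way: the names values, mask, data, masked and plain are mine, and I'm assuming it's fine to store the numbers as floats with a placeholder 0.0 under the mask. It also shows the filled method mentioned in the comments.

```python
import numpy as np
import numpy.ma as ma

values = [1, 2, None]  # example input from the question

# Mask every position that holds None; put a placeholder 0.0 there so the
# data can be stored as floats rather than as generic Python objects.
mask = [v is None for v in values]
data = [0.0 if v is None else v for v in values]

masked = ma.array(data, mask=mask, dtype=float)
masked_mean = masked.mean()  # masked entries are left out of the mean

# filled() converts back to a plain ndarray, writing a chosen value
# (NaN here) into the masked slots.
plain = masked.filled(np.nan)
print(masked_mean, plain)
```

The placeholder never enters the calculation, since the mask hides it.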

tom10
  • 3
    A member function that helped a lot was `filled`: it brings the masked array back to a normal array, filled with a value that I would recognize as invalid (NaN, -9999, whatever your users need). – mariotomo Apr 22 '10 at 09:20
  • 1
    Masked arrays are also significantly slower than regular numpy arrays, as the implementation is pure Python. If you are dealing with big data, be aware of the performance implications. – timbo Dec 03 '14 at 23:37
  • Better to use numpy.nanmean than looking for ad-hoc solutions outside of numpy; see answer below. – strangeloop Jan 25 '21 at 16:42
  • Masked arrays are not ad-hoc nor outside of numpy. The docs link in my answer shows this. – tom10 Jan 28 '21 at 16:17
7

I haven't used numpy, but in standard Python you can filter out None with a list comprehension or the filter function:

>>> [i for i in [1, 2, None] if i != None]
[1, 2]
>>> list(filter(lambda x: x != None, [1, 2, None]))
[1, 2]

and then average the result to ignore the None
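
Something along these lines should do for the averaging step. It's a minimal sketch in plain Python; the names values, kept and average are just for illustration, and returning NaN for an empty result is my own assumption about what you'd want there.

```python
values = [1, 2, None]  # example input from the question

# Keep only the real numbers; "is not None" leaves zeros in place.
kept = [v for v in values if v is not None]

# Guard against an input that was empty or contained nothing but None.
average = sum(kept) / len(kept) if kept else float("nan")
print(average)
```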

cobbal
  • 5
    `x != None` is usually written `x is not None` (PEP 8: "Comparisons to singletons like None should always be done with 'is' or 'is not', never the equality operators.") – Eric O. Lebigot Nov 22 '11 at 22:27
6

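You can use scipy for that, at least in older releases (scipy.stats.nanmean has since been removed from SciPy in favor of numpy.nanmean):

import scipy.stats.stats as st
m = st.nanmean(vec)  # vec: a float array with NaN marking the missing values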
Noam Peled
4

You might also be able to kludge with values like NaN or Inf.

In [1]: array([1, 2, None])
Out[1]: array([1, 2, None], dtype=object)

In [2]: array([1, 2, NaN])
Out[2]: array([  1.,   2.,  NaN])

Actually, it might not even be a kludge. Wikipedia says:

NaNs may be used to represent missing values in computations.

Actually, this doesn't help with the mean() function, though, since NaN propagates through it, so never mind. :)

In [20]: mean([1, 2, NaN])
Out[20]: nan
endolith
3

You can also pass None as the first argument to filter: it then drops every falsy object, 0 included :D So use it only when you don't need the zeros either.

>>> list(filter(None, [1, 2, None]))
[1, 2]
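
To see why the zero caveat matters for a mean, here is a small sketch with a made-up input that contains a real 0 (the list and all variable names are hypothetical, chosen only for this illustration):

```python
values = [0, 2, None]  # hypothetical input that contains a real zero

truthy_only = list(filter(None, values))         # drops None and also 0
not_none = [v for v in values if v is not None]  # drops only None

mean_truthy = sum(truthy_only) / len(truthy_only)
mean_not_none = sum(not_none) / len(not_none)
print(mean_truthy, mean_not_none)
```

The two means differ because filter(None, ...) silently threw away a valid data point.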
YOU
3

You can 'upcast' the array to numpy's float64 dtype, which turns None into NaN, and then use numpy's nanmean function as in the following example:

import numpy as np

arr = [1, 2, 3, None]
arr2 = np.array(arr, dtype=np.float64)
print(arr2) # [ 1.  2.  3. nan]
print(np.nanmean(arr2)) # 2.0
strangeloop
-1

np.mean(Matrice[Matrice != None])  # only valid if Matrice is a NumPy array, not a plain list
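
With the plain list from the question, the boolean indexing has nothing element-wise to work on, so the list has to become an array first. A possible sketch, assuming a reasonably recent NumPy where comparing an object array with None is done element by element; arr, keep and result are names I made up:

```python
import numpy as np

Matrice = [1, 2, None]  # the list from the question

arr = np.array(Matrice, dtype=object)  # keep None as a Python object
keep = arr != None                     # element-wise comparison on an array
result = np.mean(arr[keep].astype(float))
print(result)
```

Casting the kept values to float before taking the mean avoids doing arithmetic on an object array.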

Ishan Tomar