Possible Duplicate:
avarage of a number of arrays with numpy without considering zero values

I am working with NumPy and have several arrays of the same shape (500×500). They contain some Null values. I want an array that is the element-by-element average of the original arrays, ignoring the Null values. For example:

A=[ 1 Null 8 Null; Null 4 6 1]
B=[ 8 5 8 Null; 5 9 5 3]

the resulting array should be like:

C=[ 4.5 5 8 Null; 5 6.5 5.5 2]

How can I do that?

f.ashouri

2 Answers

7

Update: As of NumPy 1.8, you can use np.nanmean instead of scipy.stats.nanmean (which was later removed from SciPy).
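
As a minimal sketch of the np.nanmean route, here are the 2×4 arrays from the question with NaN standing in for Null. Suppressing the RuntimeWarning is my own addition: NumPy warns when a position is NaN in every input array, and the result there is simply NaN.

```python
import warnings

import numpy as np

nan = np.nan

# The arrays from the question, with NaN standing in for Null.
A = np.array([[1, nan, 8, nan], [nan, 4, 6, 1]])
B = np.array([[8, 5, 8, nan], [5, 9, 5, 3]])

# Stack along a new leading axis: shape (2, 2, 4).
stacked = np.array([A, B])

# Average over the stacking axis, ignoring NaNs. Positions that are NaN
# in every input trigger a "Mean of empty slice" RuntimeWarning and
# come out as NaN, so the warning is silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=RuntimeWarning)
    C = np.nanmean(stacked, axis=0)

print(C)
```

The same call should work unchanged for any number of 500×500 arrays, as long as they are stacked along axis 0 first.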


If you have scipy, you could use scipy.stats.nanmean:

In [2]: import numpy as np

In [45]: import scipy.stats as stats

In [3]: nan = np.nan

In [43]: A = np.array([1, nan, 8, nan, nan, 4, 6, 1])   
In [44]: B = np.array([8, 5, 8, nan, 5, 9, 5, 3])  
In [46]: C = np.array([A, B])    
In [47]: C
Out[47]: 
array([[  1.,  nan,   8.,  nan,  nan,   4.,   6.,   1.],
       [  8.,   5.,   8.,  nan,   5.,   9.,   5.,   3.]])

In [48]: stats.nanmean(C)
Warning: invalid value encountered in divide
Out[48]: array([ 4.5,  5. ,  8. ,  nan,  5. ,  6.5,  5.5,  2. ])

There are also NumPy-only solutions based on masked arrays. For example:

In [60]: C = np.array([A, B])    
In [61]: C = np.ma.masked_array(C, np.isnan(C))    
In [62]: C
Out[62]: 
masked_array(data =
 [[1.0 -- 8.0 -- -- 4.0 6.0 1.0]
 [8.0 5.0 8.0 -- 5.0 9.0 5.0 3.0]],
             mask =
 [[False  True False  True  True False False False]
 [False False False  True False False False False]],
       fill_value = 1e+20)

In [63]: np.mean(C, axis = 0)
Out[63]: 
masked_array(data = [4.5 5.0 8.0 -- 5.0 6.5 5.5 2.0],
             mask = [False False False  True False False False False],
       fill_value = 1e+20)

In [66]: np.ma.filled(np.mean(C, axis = 0), nan)
Out[66]: array([ 4.5,  5. ,  8. ,  nan,  5. ,  6.5,  5.5,  2. ])
unutbu
  • An advantage of `np.ma` is that it works with integer arrays, while the `nan...` functions require float arrays as inputs. – Pierre GM Nov 08 '12 at 12:34
  • @PierreGM: Ah yes, because `np.nan`s are not allowed in integer arrays. Thanks for pointing this out. – unutbu Nov 08 '12 at 12:36
  • There's also [numpy.nanmean](http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.nanmean.html) if you don't have scipy. – rjf Apr 15 '14 at 22:24
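
To illustrate the point about integer arrays made in the comments, here is a rough sketch that masks a sentinel value instead of NaN. Treating 0 as the "missing" marker is an assumption for this example (it matches the linked duplicate, which asks about ignoring zeros). `np.ma.masked_equal` is the NumPy call that masks every element equal to a given value.

```python
import numpy as np

# Integer arrays cannot hold NaN, so 0 is used as the "missing" marker
# here (an assumption for this example).
A = np.array([1, 0, 8, 0, 0, 4, 6, 1])
B = np.array([8, 5, 8, 0, 5, 9, 5, 3])

# Mask every element equal to the sentinel value.
masked = np.ma.masked_equal(np.array([A, B]), 0)

# Mean over the stacking axis; masked entries are skipped, and a column
# that is masked in every array stays masked in the result.
avg = masked.mean(axis=0)

# Convert to a plain float array, putting NaN where nothing was available.
result = avg.filled(np.nan)
print(result)
```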
1
  1. Start from lists like these (you can also use None instead of 0):

    A = [1, 0, 8, 0, 0, 4, 6, 1]
    B = [8, 5, 8, 0, 5, 9, 5, 3]
    
  2. Put them together in a list:

    lst = [A, B]
    
  3. Define a function to compute the mean of a list of numbers:

    def mean(nums):
        return float(sum(nums)) / len(nums) if nums else 0
    
  4. Finally you can compute the average in this way:

    # list(...) is needed on Python 3, where filter() returns an iterator
    C = [mean(list(filter(None, col))) for col in zip(*lst)]
    
enrico.bacis