I am working with NumPy and have several arrays of the same size and shape, for example:
a = [153 186 0 258]
b = [156 136 156 0]
c = [193 150 950 757]
I want the element-wise average of the arrays, but with zero values ignored in the computation. For this example, the resulting array would be d = [167.333 157.333 553 507.5],
which comes from: d = [(153+156+193)/3 (186+136+150)/3 (156+950)/2 (258+757)/2].
Is it possible to do that?
John Slade

f.ashouri
1 Answer
In Python:
>>> a = [153, 186, 0, 258]
>>> b = [156, 136, 156, 0]
>>> c = [193, 150, 950, 757]
>>> import statistics
>>> [statistics.mean([x for x in s if x]) for s in zip(*[a, b, c])]
[167.33333333333334, 157.33333333333334, 553, 507.5]
In numpy:
>>> import numpy as np
>>> A = np.vstack([a,b,c])
>>> np.average(A, axis=0, weights=A.astype(bool))
array([ 167.33333333, 157.33333333, 553. , 507.5 ])
If every value in a column can be zero, `np.average` raises a ZeroDivisionError, because the weights for that column sum to zero and cannot be normalized. In that case you may want to use masked arrays instead: slots in the output that have no nonzero inputs come back masked.
>>> a[0] = b[0] = c[0] = 0
>>> A = np.vstack([a,b,c])
>>> np.ma.average(A, axis=0, weights=A.astype(bool))
masked_array(data=[--, 157.33333333333334, 553.0, 507.5],
             mask=[ True, False, False, False],
       fill_value=1e+20)
>>> np.ma.average(A, axis=0, weights=A.astype(bool)).tolist()
[None, 157.33333333333334, 553.0, 507.5]
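For comparison, the same result can be computed by hand: zeros contribute nothing to a column's sum, so only the divisor has to change from "number of arrays" to "number of nonzero entries". The sketch below assumes NaN is an acceptable marker for columns with no nonzero values; the helper name `mean_ignoring_zeros` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def mean_ignoring_zeros(arrays):
    """Element-wise mean of equally shaped arrays, skipping zero entries.

    Hypothetical helper for illustration. Positions where every input
    is zero are returned as NaN.
    """
    stacked = np.stack(arrays)                  # shape: (n_arrays, ...)
    totals = stacked.sum(axis=0)                # zeros add nothing to the sum
    counts = np.count_nonzero(stacked, axis=0)  # how many entries actually count

    # Pre-fill with NaN and divide only where there is something to average.
    result = np.full(totals.shape, np.nan)
    np.divide(totals, counts, out=result, where=counts > 0)
    return result

a = [153, 186, 0, 258]
b = [156, 136, 156, 0]
c = [193, 150, 950, 757]

d = mean_ignoring_zeros([a, b, c])
print(d)
```

Unlike `np.ma.average`, this returns a plain float array, which may be easier to pass on to code that does not understand masked arrays.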

wim
-
My arrays are 1200*1200 and I tried to simplify it in the question. It seems that it doesn't work for arrays that have more than one row. How can I do this? – f.ashouri Nov 08 '12 at 10:43
-
Assuming you want your output to have shape (1200, 1200) as well, use `np.dstack` instead and average along the depth axis. If you want the output shape to be (1200,), then I can see no reason why vstack wouldn't still work. – wim Nov 08 '12 at 11:26
-
Great answer! I used `np.ma.average()` as a robust solution to handle cases where the sum of all values is zero. – Jason Bellino Nov 08 '13 at 15:25
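The `np.dstack` suggestion from the comments can be sketched as follows for 2-D inputs. The 2x2 arrays here are made-up stand-ins for the 1200*1200 arrays, and `np.ma.average` is used so that a position where all inputs are zero comes back masked rather than raising an error.

```python
import numpy as np

# Small 2x2 stand-ins for the real 1200x1200 arrays (illustrative values only).
x = np.array([[153, 0], [156, 0]])
y = np.array([[156, 136], [0, 0]])
z = np.array([[193, 150], [950, 0]])

# dstack places each input on its own layer along a third axis: shape (2, 2, 3).
stacked = np.dstack([x, y, z])

# Average along the depth axis; zeros get weight 0, everything else weight 1.
result = np.ma.average(stacked, axis=2, weights=stacked.astype(bool))

print(result.shape)  # (2, 2)
print(result)
```

The bottom-right position is zero in every input, so it is masked in `result`; calling `result.filled(np.nan)` would turn it into a regular array with NaN in that slot.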