average of a number of arrays with numpy without considering zero values

Question

I am working with NumPy and have several arrays of the same size and shape, for example:

a = [153 186 0 258]
b = [156 136 156 0]
c = [193 150 950 757]

I want the element-wise average of the arrays, but the zero values should be ignored in the computation. For this example the resulting array would be:

d = [167.333 157.333 553 507.5]

which comes from this computation:

d = [(153+156+193)/3 (186+136+150)/3 (156+950)/2 (258+757)/2]

Is it possible to do that?

wim · Accepted Answer · 2020-09-13T15:41:53.093


In Python:

>>> a = [153, 186, 0, 258]
>>> b = [156, 136, 156, 0]
>>> c = [193, 150, 950, 757]
>>> import statistics
>>> [statistics.mean([x for x in s if x]) for s in zip(*[a, b, c])]
[167.33333333333334, 157.33333333333334, 553, 507.5]

In numpy:

>>> import numpy as np
>>> A = np.vstack([a,b,c])
>>> np.average(A, axis=0, weights=A.astype(bool))
array([ 167.33333333,  157.33333333,  553.        ,  507.5       ])

If all values in a column can be zero, you may want to use masked arrays instead. `np.average` cannot normalize a column whose weights sum to zero (it raises `ZeroDivisionError`), whereas `np.ma.average` masks the undefined slots in the output.

>>> a[0] = b[0] = c[0] = 0
>>> A = np.vstack([a,b,c])
>>> np.ma.average(A, axis=0, weights=A.astype(bool))
masked_array(data=[--, 157.33333333333334, 553.0, 507.5],
             mask=[ True, False, False, False],
             fill_value=1e+20)
>>> np.ma.average(A, axis=0, weights=A.astype(bool)).tolist()
[None, 157.33333333333334, 553.0, 507.5]
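As a variation that is not part of the answer above, the zeros can also be masked directly and the masked mean taken, instead of passing boolean weights. This is only a sketch under the assumption that a zero always means "missing"; it uses `np.ma.masked_equal`, and the array values are the ones from the all-zero-column example.

```python
import numpy as np

# Same data as the example above, with the first column all zeros.
A = np.vstack([
    [0, 186, 0, 258],
    [0, 136, 156, 0],
    [0, 150, 950, 757],
])

# Mask every entry equal to 0 (assumption: 0 means "no value").
M = np.ma.masked_equal(A, 0)

# Mean over the rows; masked entries are skipped, and a column with
# no unmasked entries comes out masked rather than raising an error.
d = M.mean(axis=0)
print(d.tolist())  # the masked first slot is reported as None
```

Both forms skip the zeros; which one reads better is mostly a matter of taste.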

edited Sep 13 '20 at 15:41

answered Nov 08 '12 at 03:31


My arrays are 1200*1200 and I tried to simplify it in the question. It seems that it doesn't work for arrays that have more than one row. How can I do this? – f.ashouri Nov 08 '12 at 10:43
Assuming you want your output to have shape (1200, 1200) as well, use `np.dstack` instead and average along the depth axis. If you want the output shape to be (1200,), then I can see no reason why `vstack` wouldn't still work. – wim Nov 08 '12 at 11:26

Great answer! I used `np.ma.average()` as a robust solution to handle cases where the sum of all values is zero. – Jason Bellino Nov 08 '13 at 15:25
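For the 2-D case raised in the comments, the `np.dstack` suggestion might look like the sketch below. The small 2x2 arrays are made-up stand-ins for the 1200x1200 inputs, and `np.ma.average` is used so that a position that is zero in every array would come out masked rather than raising an error.

```python
import numpy as np

# Made-up 2x2 stand-ins for the real 1200x1200 arrays.
a = np.array([[153, 186], [0, 258]])
b = np.array([[156, 136], [156, 0]])
c = np.array([[193, 150], [950, 757]])

# dstack places the arrays along a third (depth) axis: shape (2, 2, 3).
A = np.dstack([a, b, c])

# Nonzero entries get weight 1 and zeros get weight 0; averaging along
# the depth axis (axis=2) keeps the original 2-D shape.
d = np.ma.average(A, axis=2, weights=A.astype(bool))
print(d.shape)
print(d)
```

The only changes from the 1-D version are the stacking function and the axis passed to the average.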
