In a list, I'd like to reject some values because they deviate too much from the median: 0.4877, 0.5113 and 1.5103.

I wrote the following code and it seems to work, but I'd like to know whether this is the right way to do it. In particular, data_std = 0.30988 seems like a lot. Should I use the square root?

import numpy

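def out_of_range(d, lower, upper):
    # Return an indentation prefix marking d as rejected when it lies outside [lower, upper].
    # (Parameters renamed so they no longer shadow the built-ins min and max.)
    if d < lower or d > upper:
        return "    "  # rejected
    return ""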

data = list([0.1410,
            0.1437,
            0.1371,
            0.1318,
            0.4877,
            0.5113,
            1.5103,
            0.1388,
            0.1398,
            0.1384,
            0.1406,
            0.1383,
            0.1458,
            0.1410,
            0.1423,
            0.1372,
            0.1386,
            0.1343,
            0.1397,
            0.1413])

data_mean   = numpy.mean(data)
data_std    = numpy.std(data)
data_median = numpy.median(data)

min_range = data_median - data_std
max_range = data_median + data_std

print("Mean      : " + str(data_mean))
print("Median    : " + str(data_median))
print("std dev   : " + str(data_std))
print("min_range : " + str(min_range))
print("max_range : " + str(max_range))
print("")
print("---data---")
for d in data:
    print(out_of_range(d, min_range, max_range) + str(d))

Returns

Mean      : 0.24395
Median    : 0.1402
std dev   : 0.309883094892
min_range : -0.169683094892
max_range : 0.450083094892

---data---
0.141
0.1437
0.1371
0.1318
    0.4877
    0.5113
    1.5103
0.1388
0.1398
0.1384
0.1406
0.1383
0.1458
0.141
0.1423
0.1372
0.1386
0.1343
0.1397
0.1413
snoob dogg

1 Answer

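The number is correct: numpy.std returns the population standard deviation (it divides by N), and computing it by hand gives the same value.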

>>> import numpy as np
>>> (sum((i - np.mean(data))**2 for i in data) / len(data))**0.5
0.30988309489225124

>>> np.std(data)
0.30988309489225124
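
If the concern is the width of the rejection range rather than the value itself, note that the standard deviation is inflated by the very values you want to reject. One possible alternative, sketched below, is to measure spread with the median absolute deviation (MAD), which is much less sensitive to a few extreme points. This is not what the question's code does: the helper name mad_outlier_mask, the 0.6745 scaling constant and the 3.5 cutoff are assumptions (commonly used defaults for a "modified z-score"), so tune them for your data.

```python
import numpy as np

data = np.array([0.1410, 0.1437, 0.1371, 0.1318, 0.4877, 0.5113, 1.5103,
                 0.1388, 0.1398, 0.1384, 0.1406, 0.1383, 0.1458, 0.1410,
                 0.1423, 0.1372, 0.1386, 0.1343, 0.1397, 0.1413])


def mad_outlier_mask(values, threshold=3.5):
    """Return a boolean mask that is True where a value is flagged as an outlier.

    Hypothetical helper: threshold=3.5 is an assumed, conventional cutoff.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # Median absolute deviation: the median distance of the points from the median.
    mad = np.median(np.abs(values - median))
    if mad == 0:
        # Degenerate case (at least half the values are identical): flag nothing.
        return np.zeros(values.shape, dtype=bool)
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data (assumed convention).
    modified_z = 0.6745 * (values - median) / mad
    return np.abs(modified_z) > threshold


mask = mad_outlier_mask(data)
rejected = data[mask]
kept = data[~mask]
print("rejected: " + str(rejected))
print("kept    : " + str(len(kept)) + " values")
```

With this data the MAD is about 0.002, so the cutoff sits roughly 0.01 away from the median instead of 0.31, and only 0.4877, 0.5113 and 1.5103 are flagged.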
galaxyan
  • Indeed, by "seems to be a lot" I was referring to the computation of the min and max range, not challenging NumPy ^^. Sorry that wasn't clear. – snoob dogg Apr 07 '17 at 17:20