I am trying to find outliers in a small dataset that I created to understand the topic myself. It's a simple Python list, but I am not getting the desired outcome. I am using Google Colab. I am relying on the idea that, in a normal distribution, outliers mostly lie more than 3 standard deviations away from the mean.
The code is given below:
import numpy as np

df2 = [12, 13, 14, 15, 10, 12, 14, 15, 1007, 12, 14, 17, 18, 1005, 14, 15, 16, 17, 13, 14, 1100, 12, 13, 14, 15]
outliers = []

def detect_outliers(data):
    threshold = 3  # flag values more than 3 standard deviations from the mean
    mean = np.mean(data)
    standard_deviation = np.std(data)
    for i in data:
        z_score = (i - mean) / standard_deviation
        if np.abs(z_score) > threshold:
            outliers.append(i)
    return outliers

detect_outliers(df2)
The output I get is an empty list: []
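To see why the list comes back empty, it may help to print the z-scores themselves. The sketch below is only a diagnostic, not a fix: it uses the standard-library `statistics` module (an assumption made so the snippet runs without NumPy), where `pstdev` is the population standard deviation and so should match the default behaviour of `np.std`. The names `z_scores`, `largest` and `max_abs_z` are illustrative and not part of the original code.

```python
import statistics

df2 = [12, 13, 14, 15, 10, 12, 14, 15, 1007, 12, 14, 17, 18, 1005,
       14, 15, 16, 17, 13, 14, 1100, 12, 13, 14, 15]

mean = statistics.fmean(df2)   # same quantity as np.mean(df2)
std = statistics.pstdev(df2)   # population std (divides by n), like np.std's default

# z-score of every point, computed exactly as in detect_outliers
z_scores = [(x - mean) / std for x in df2]

# Show the three points that sit furthest from the mean
largest = sorted(zip(df2, z_scores), key=lambda p: abs(p[1]), reverse=True)[:3]
for value, z in largest:
    print(f"value={value:5d}  z={z:.2f}")

max_abs_z = max(abs(z) for z in z_scores)
print(f"mean={mean:.2f}  std={std:.2f}  max |z|={max_abs_z:.2f}")
```

With this data the mean is about 136.84 and the standard deviation about 332.89, so even 1100 has a z-score of roughly 2.89. The three large values pull the mean and the standard deviation up with them, which appears to be why no point crosses the threshold of 3.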