-1

I want to remove the outliers that a boxplot finds in each column of my dataframe. I know a boxplot identifies outliers using the IQR rule and displays them on the graph. I know how to plot the boxplot using seaborn, but I am unsure how to determine exactly which rows these outliers refer to and how to remove them. Is there a function or method to do this?

Rishabh Sharma

1 Answer

3

According to the basic definition of IQR outliers, values less than Q1 - 1.5*IQR and values greater than Q3 + 1.5*IQR are treated as outliers. So, for a single column:

Q1 = df['col_name'].quantile(0.25)
Q3 = df['col_name'].quantile(0.75)
IQR = Q3 - Q1

Now, the outliers are:

df[(df['col_name'] < Q1-1.5*IQR ) | (df['col_name'] > Q3+1.5*IQR)]['col_name']
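
To get the rows and remove them across every numeric column at once, the same rule can be applied to the whole dataframe. This is only a minimal sketch: the sample data, the column names `a` and `b`, and the helper `iqr_outlier_mask` are made up for illustration, and it assumes you want to drop a row if any of its columns holds an outlier.

```python
import pandas as pd

# Hypothetical example data; replace with your own dataframe.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 100],
    "b": [10, 11, 12, 13, 14, 15],
})

def iqr_outlier_mask(frame, k=1.5):
    """Return a boolean DataFrame, True where a value is an IQR outlier."""
    q1 = frame.quantile(0.25)   # per-column first quartile
    q3 = frame.quantile(0.75)   # per-column third quartile
    iqr = q3 - q1
    # Comparing a DataFrame with a Series aligns on column labels.
    return (frame < q1 - k * iqr) | (frame > q3 + k * iqr)

# Restrict to numeric columns, since quantiles are undefined for text.
numeric = df.select_dtypes(include="number")
mask = iqr_outlier_mask(numeric)

# Rows that contain at least one outlier in any numeric column.
outlier_rows = df[mask.any(axis=1)]

# The dataframe with those rows removed.
cleaned = df[~mask.any(axis=1)]
```

`outlier_rows.index` gives the row labels of the outliers. If you would rather keep the rows and only blank out the offending cells, `numeric.mask(mask)` replaces them with NaN instead.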
squaleLis