Questions tagged [iqr]

IQR stands for "Interquartile range" in statistics.

The interquartile range (IQR) is the difference between the third and first quartiles (Q3 - Q1). It is a measure of dispersion that is less sensitive to outliers than the standard deviation.

This descriptive statistic may be familiar from boxplots, where the box spans the IQR.
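As a minimal sketch of the definition above, the following computes the quartiles, the IQR, and the 1.5 * IQR fences that several of the questions below refer to. It uses only the Python standard library. The function name `iqr_fences`, the multiplier parameter `k`, and the sample data are illustrative assumptions, and the "inclusive" quantile method is one of several conventions, so other tools may give slightly different quartiles for the same data.

```python
from statistics import quantiles


def iqr_fences(values, k=1.5):
    """Return (q1, q3, iqr, lower, upper) for a sequence of numbers.

    Hypothetical helper for illustration; k is the fence multiplier.
    """
    # method="inclusive" interpolates linearly between neighbouring
    # sorted data points; other quantile conventions exist.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1, q3, iqr, q1 - k * iqr, q3 + k * iqr


# Made-up sample: the integers 1 to 10 plus one obvious outlier.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40]

q1, q3, iqr, lower, upper = iqr_fences(data)

# Keep only the values that lie inside the fences.
kept = [x for x in data if lower <= x <= upper]

print(q1, q3, iqr)   # 3.5 8.5 5.0
print(lower, upper)  # -4.0 16.0
print(kept)          # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Passing a different `k` widens or narrows the fences, which is the usual way to make the outlier rule stricter or more lenient.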

75 questions
49
votes
6 answers

how to use pandas filter with IQR

Is there a built-in way to filter a column by IQR (i.e. keep values between Q1 - 1.5*IQR and Q3 + 1.5*IQR)? Also, any other suggestions for generalized filtering in pandas will be appreciated.
Qijun Liu
  • 1,685
  • 1
  • 13
  • 11
9
votes
3 answers

How to Remove outlier from DataFrame using IQR?

I have a DataFrame with a lot of columns (around 100 features). I want to apply the interquartile method to remove the outliers from the data frame. I am using this Stack Overflow link, but the problem is that none of the above methods is working…
Imran Ahmad Ghazali
  • 605
  • 1
  • 10
  • 16
6
votes
1 answer

Violin plot: How is the adjacent value range determined, and why is it different from boxplot?

In theory the violinplot of vioplot package is a boxplot + density function. In the "boxplot part", the black box corresponds to the IQR (indeed, see below), and the midline should correspond to the same range (adjacent values, default 1.5 IQR),…
bud.dugong
  • 689
  • 7
  • 16
4
votes
2 answers

Finding IQR of groups of rows

I want to find the IQR of a range of values in a dataframe. These values are also grouped, so I need to find the IQR of each group in the dataframe. I have the following table: Block DNAname Spot_Size Molarity Cy3_Fluorescence 1…
MRF
  • 377
  • 1
  • 4
  • 15
3
votes
2 answers

Outlier rules in JFreeChart Boxplots?

I've got some questions regarding outlier rules in JFreeChart: Is it possible to influence the outlier rules in a JFreeChart boxplot? I would assume that the default setting for outliers is Q3+1.5*IQR and Q1-1.5*IQR? Is there a default rule for…
dennis
  • 683
  • 2
  • 5
  • 18
3
votes
2 answers

Combining across and filter in groups

I'd like to filter only the x1, x2, and x3 values that fall between the 5th and 95th quantiles, by group (id). But I haven't succeeded in combining across with my variables (x1, x2, and x3). In my example: library(dplyr) data <-…
Leprechault
  • 1,531
  • 12
  • 28
2
votes
1 answer

Welford's online variance algorithm, but for Interquartile Range?

Short version: Welford's online algorithm lets you keep a running value for variance, meaning you don't have to keep all the values (e.g. in a memory-constrained system). Is there something similar for interquartile range (IQR)? An online algorithm…
Ian Boyd
  • 246,734
  • 253
  • 869
  • 1,219
2
votes
1 answer

Python boxplot size of the IQR from 50% to 70%

I would like to know if it's possible to put 70% of the population in the boxplot as in the red one? I know that Q3 - Q1 = IQR but don't know how this can help me. I'm using matplotlib to draw my boxplot. def…
2
votes
1 answer

Apache Zeppelin Not Showing Full Stack Trace

I have the following paragraph that does some outlier detection using the interquartile range method. Strangely, it runs into an error, but Apache Zeppelin truncates the stack trace too much for it to be useful. Here is the code: def interQuartileRangeFiltering(df:…
joesan
  • 13,963
  • 27
  • 95
  • 232
2
votes
1 answer

how to delete single values based on IQR filtering from dataframe

I have a dataframe with around 80 columns and a few hundred rows; below is an example dataframe. I need to filter the dataframe based on the IQR value, then delete the outliers, but not the whole row, only the actual value/cell. As far as I could…
2
votes
2 answers

Plotly: How to change length of whiskers (min/max) in a boxplot?

I know that 1.5 * IQR is a common rule, but I would like to plot other min/max if possible. I am using plotly (python). Basically, I would like to define a function to show the boxplot by the parameters data frame, column, and a self-defined…
Len
  • 23
  • 1
  • 4
2
votes
5 answers

Using NumPy, how is the 25th percentile calculated for the numbers 1 to 10?

from numpy import percentile import numpy as np data=np.array([1,2,3,4,5,6,7,8,9,10]) # calculate quartiles quartile_1 = percentile(data, 25) quartile_3 =percentile(data, 75) # calculate min/max print(quartile_1) # show 3.25 print(quartile_3) #…
2
votes
1 answer

reldist::wtd.iqr gives different result from IQR for equal weights

I've been getting unexpected results using the wtd.iqr function from the reldist package (version 1.6.6) to calculate a weighted interquartile range (as opposed to the unweighted interquartile range returned by IQR from the vanilla R stats package).…
Westcroft_to_Apse
  • 1,503
  • 4
  • 20
  • 29
2
votes
1 answer

How to find outliers in data with discrete variables in R

I'm beginning to learn R and data science in general. I have a data frame and most of my variables and the class I want to predict are discrete. What I need to do is find outliers in this data so I can deal with them by imputation or whatever. Some…
Renato Borges
  • 1,043
  • 9
  • 12
1
vote
1 answer

groupby operation in pandas.DataFrame without outliers

For a pandas.Series, I know how to remove outliers. With something like this: x = pd.Series(np.random.normal(size=1000)) iqr = x.quantile(.75) - x.quantile(.25) y = x[x.between(x.quantile(.25) - 1.5*iqr, x.quantile(.75) + 1.5*iqr)] I would like to…
phollox
  • 323
  • 3
  • 13