Answer:
I assume your problem is the following: first, you detect outliers (just as the boxplot function does) and remove them. Then you draw boxplots of the cleaned data, which again show outliers, although you expected to see none.
This is not necessarily an error in your code; it is an error in your expectations. When you remove the outliers, the statistics of your data set change. For example, the quartiles are no longer the same. Hence, you may identify "new" outliers. See the following example:
## create example data
set.seed(12345)
rand <- rexp(100,23)
## plot. gives outliers.
boxplot(rand)
## detect outliers with these functions
detectaOutliers = function(x) {
  q = quantile(x, probs = c(0.25, 0.75))
  R = IQR(x)
  OM1 = q[1] - (R * 1.5) # moderate outliers
  OM3 = q[2] + (R * 1.5)
  OE1 = q[1] - (R * 3) # extreme outliers
  OE3 = q[2] + (R * 3)
  moderados = ifelse(x < OM1 | x > OM3, 1, 0)
  extremos = ifelse(x < OE1 | x > OE3, 1, 0) # computed but not used below
  cbind(extOut = moderados) # returns the moderate-outlier flags
}
detectOut <- function(x) boxplot(x, plot = FALSE)$out
## clean your data
clean1 <- rand[!as.logical(detectaOutliers(rand))]
clean2 <- rand[!rand %in% detectOut(rand)]
## check that these functions do the same.
## identical() also copes with vectors of different lengths.
identical(clean1, clean2)
# Fun fact: depending on your data, clean1 and clean2
# are not always the same. See Note 1 below.
## plot cleaned data
boxplot(clean2)
## Still has outliers, but "new" ones. Confirm with:
sort(boxplot(rand, plot = FALSE)$out) # original outliers
sort(boxplot(clean2, plot = FALSE)$out) # new outliers
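To make the shift visible, here is a minimal sketch that computes the outlier fences (hinge minus or plus 1.5 times the hinge spread) before and after cleaning. The helper name fences is my own and not part of base R, and it assumes the default coef = 1.5 of boxplot.stats.

```r
## same example data as above
set.seed(12345)
rand <- rexp(100, 23)

## fences() is a hypothetical helper, not part of base R
fences <- function(x, coef = 1.5) {
  h <- fivenum(x)[c(2, 4)]              # lower and upper hinge
  c(lower = h[1] - coef * diff(h),
    upper = h[2] + coef * diff(h))
}

## remove everything that boxplot.stats flags as an outlier
clean <- rand[!rand %in% boxplot.stats(rand)$out]

fences(rand)   # fences computed from the original data
fences(clean)  # fences computed from the cleaned data (usually different)

## values inside the old fences that fall outside the new ones
clean[clean < fences(clean)["lower"] | clean > fences(clean)["upper"]]
```

Every value in clean lies inside the original fences by construction, so anything returned by the last line is an outlier only with respect to the recomputed fences.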
Note 1:
Your code does not necessarily identify outliers the same way as the boxplot function in R (I am not sure about the ggplot boxplot, but this is at least true for graphics::boxplot):
## The boxplot function (rather: boxplot.stats)
## does not use the quantile function, but the fivenum function
## to identify outliers. They produce different results, e.g., here:
fivenum(rand)[c(2,4)]
quantile(rand,probs=c(0.25,0.75))
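If you want your own detection to agree with boxplot, one option is to build it on the hinges from fivenum instead of quantile. The following is only a sketch: manualOut is a hypothetical helper name, and it assumes the default coef = 1.5 and data without missing values.

```r
## same example data as above
set.seed(12345)
rand <- rexp(100, 23)

## manualOut() is a hypothetical helper mirroring the hinge-based rule
manualOut <- function(x, coef = 1.5) {
  h <- fivenum(x)[c(2, 4)]   # hinges, as used by boxplot.stats
  iqr <- diff(h)             # hinge spread
  x[x < h[1] - coef * iqr | x > h[2] + coef * iqr]
}

## compare with the outliers reported by boxplot.stats
identical(sort(manualOut(rand)), sort(boxplot.stats(rand)$out))
```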
Note 2:
If you want boxplots that do not draw the outliers, set outline = FALSE in the boxplot function (for ggplot, see "Ignore outliers in ggplot2 boxplot").
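A short sketch of this option for graphics::boxplot, reusing the example data from above. Note that this only hides the points; the data themselves are not changed.

```r
## same example data as above
set.seed(12345)
rand <- rexp(100, 23)

## draw the boxplot without the outlier points
b <- boxplot(rand, outline = FALSE)

## the outliers are still computed and returned, just not drawn
b$out
```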