
I have this boxplot with outliers. I need to plot the row number of each outlier observation next to it, so that it is easy to go into the data set and find where the value is. Can somebody help me?

set.seed(1)
a <- runif(10,1,100)
b <-c("A","A","A","A","A","B","B","B","B","B")
t <- cbind(a,b)
bp <- boxplot(a~b)
text(x = 1, y = bp$stats[,1] + 2, labels = round(bp$stats[,1], 2))
text(x = 2, y = bp$stats[,2] + 2, labels = round(bp$stats[,2], 2))


rawr
  • In order that we can help you, please provide example data and the steps you've tried so far. Consider [*How to make a great reproducible example*](https://stackoverflow.com/a/5963610/6574038), thank you. – jay.sf Mar 06 '18 at 19:57
  • Wouldn't the easiest way be to just subset for values in a given category that are above or below some amount read from the graph? Like `df[category == "A" & value < 20, ]`, `df[category == "B" & value > 40, ]` – Calum You Mar 06 '18 at 19:59
  • @jaySf, my code: `a <- runif(10,1,100); b <- c("A","A","A","A","A","B","B","B","B","B"); t <- cbind(a,b); bp <- boxplot(a~b); text(x = 1, y = bp$stats[,1] + 2, labels = round(bp$stats[,1], 2)); text(x = 2, y = bp$stats[,2] + 2, labels = round(bp$stats[,2], 2))` – Máiron Chaves Mar 06 '18 at 20:11
  • It seems like you know how to use the return of boxplot to make that figure `bp <- boxplot(mpg ~ vs, mtcars); text(col(bp$stats), bp$stats, bp$stats, pos = 3)` but not that boxplot also returns the outliers `mtcars[which(mtcars$mpg %in% bp$out), ]` – rawr Mar 06 '18 at 20:12
  • Run `bp[]` or `names(bp)` to investigate content. – Brian Davis Mar 06 '18 at 22:53

1 Answer


What is the point of `t <- cbind(a, b)`? That makes a character matrix and converts your numbers to character strings, and you don't use it anyway. If you want a single data structure, use `data.frame(a, b)`, which leaves `a` numeric and makes `b` a factor (in R 4.0.0 and later `b` stays character unless you set `stringsAsFactors = TRUE`). I do not get the plot you do with `set.seed(1)`, so I'll provide slightly different data. Note the use of the `pos=` and `offset=` arguments in `text()`. Be sure to read the manual page to see what they are doing. The last two lines look up the position of each outlier in `a` and print it next to the point:

a <- c(99.19, 59.48, 48.95, 18.17, 75.73, 45.94, 51.61, 21.55, 37.41, 
59.98, 57.91, 35.54, 4.52, 64.64, 75.03, 60.21, 56.53, 53.08, 
98.52, 51.26)
b <- c("A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", 
"B", "B", "B", "B", "B", "B", "B", "B")
bp <- boxplot(a~b)
text(x = 1, y = bp$stats[,1], labels = round(bp$stats[, 1], 2), 
     pos=c(1, 3, 3, 1, 3), offset=.2)
text(x = 2, y = bp$stats[, 2], labels = round(bp$stats[, 2], 2), 
     pos=c(1, 3, 3, 1, 3), offset=.2)
obs <- which(a %in% bp$out)
text(bp$group, bp$out, obs, pos=4)
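
One caveat with `which(a %in% bp$out)`: it compares values across all groups at once, so a value that is an outlier in one group but ordinary in another would also be picked up, and the labels only line up with `bp$out` when the rows happen to be ordered by group. A more defensive sketch, using the same data, does the lookup group by group. The helper name `rows` is just illustrative, and the sketch assumes `b` is a character vector or factor so that `split()` orders the groups the same way `boxplot()` does:

```r
# Same data as above
a <- c(99.19, 59.48, 48.95, 18.17, 75.73, 45.94, 51.61, 21.55, 37.41,
       59.98, 57.91, 35.54, 4.52, 64.64, 75.03, 60.21, 56.53, 53.08,
       98.52, 51.26)
b <- rep(c("A", "B"), each = 10)

bp <- boxplot(a ~ b)

# Row numbers of `a`, split by group (same group order as the boxes)
rows <- split(seq_along(a), b)

# Within each group, keep the rows whose value is one of that group's outliers
obs <- unlist(lapply(seq_along(rows), function(i) {
  idx <- rows[[i]]
  idx[a[idx] %in% bp$out[bp$group == i]]
}))

# Label each outlier with its row number, to the right of the point
text(bp$group, bp$out, labels = obs, pos = 4)
```

With this data the result is the same as before (rows 1, 13 and 19), but the labels stay aligned with `bp$out` even if the groups are interleaved in the data.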


dcarlson