In ggplot2 boxplots, how can I mark the two points at the ends of the whiskers (upper and lower), e.g. with an "x"? The result should be a boxplot with one "x" at the very top of the upper whisker and another "x" at the very bottom of the lower whisker.

I have searched the internet extensively for an answer but could not find one. So far I have only managed to add an "x" at the mean of each boxplot, using stat_summary with the mean function.

How can I add the other two points?

To keep us on the same page, please use R's mtcars dataset and make a boxplot with mpg on the y axis and cyl on the x axis. You will end up with 3 boxplots, one for each value of cyl in mtcars.

According to R:

The upper end is defined as Q3 + 1.5*IQR
The lower end is defined as Q1 - 1.5*IQR
Note: IQR = Q3 - Q1
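
As a rough sketch, the limits defined above can be computed per cylinder group in mtcars as follows. The name `fences` is only illustrative, and the quartiles come from `quantile()` with its default settings, which is an assumption about how Q1 and Q3 are meant here.

```r
# Q1 - 1.5*IQR and Q3 + 1.5*IQR of mpg, computed separately for each cyl group
fences <- sapply(split(mtcars$mpg, mtcars$cyl), function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # Q1 and Q3
  iqr <- q[2] - q[1]
  c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)
})

# One column per cyl value (4, 6, 8), rows "lower" and "upper"
print(fences)
```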
doctorate
  • You are more likely to get an answer if you show what you have done. Post your example of creating an x at the mean, using `mtcars`. (And once you've done that, you may realize that you can plot an x at the `min` and `max` of your dataset.) – Andrie Mar 23 '12 at 17:48

1 Answer

You just need to calculate the end points of the whiskers and add them using stat_summary. For example:

## Load the library and the example data
library(ggplot2)
data(mpg)

## Create a function to calculate the whisker end points
## (there is probably a built-in function that does this)
get_tails = function(x) {
  ## A single value has no whiskers: return it unchanged so that no
  ## stray marks appear at the periphery of the plot
  if(length(x) == 1){return(x)}
  q1 = quantile(x)[2]
  q3 = quantile(x)[4]
  iqr = q3 - q1
  upper = q3+1.5*iqr
  lower = q1-1.5*iqr
  ## Keep the most extreme data points that lie within the limits
  up = max(x[x <= upper])
  lo = min(x[x >= lower])
  return(c(lo, up))
}

Use stat_summary to add it to your plot:

ggplot(mpg, aes(x=drv,y=hwy)) + geom_boxplot() + 
  ## `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0
  stat_summary(geom="point", fun=get_tails, colour="Red")
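
For the mtcars setup from the question, the same idea might look like the sketch below. `whisker_ends` is a hypothetical helper name, not a ggplot2 function; `shape = 4` draws the point as an "x"; and the `fun` argument assumes ggplot2 3.3.0 or later (older versions use `fun.y`).

```r
library(ggplot2)

# Hypothetical helper: the most extreme data points that still lie
# within 1.5 * IQR of the quartiles, returned as c(lowest, highest).
whisker_ends <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  inside <- x[x >= q[1] - 1.5 * iqr & x <= q[2] + 1.5 * iqr]
  range(inside)
}

# cyl is numeric in mtcars, so convert it to a factor to get one box per value
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  stat_summary(geom = "point", fun = whisker_ends, shape = 4, size = 3)
print(p)
```

Because the helper uses inclusive comparisons, a group with a single value returns that value twice, so the two marks simply overlap.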

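Also, your definition of the end points isn't quite correct: the whiskers extend to the most extreme data points that lie within 1.5*IQR of the box, not to Q3+1.5*IQR and Q1-1.5*IQR themselves. See my answer to another question for a few more details.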
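
A small sketch of that difference, using a made-up vector rather than real data (all variable names here are illustrative):

```r
x <- c(1, 2, 3, 4, 5, 20)

q <- quantile(x, c(0.25, 0.75), names = FALSE)  # Q1 = 2.25, Q3 = 4.75
iqr <- q[2] - q[1]                              # 2.5

# The limits from the question's definition
fence_lower <- q[1] - 1.5 * iqr
fence_upper <- q[2] + 1.5 * iqr

# Where the whiskers actually end: the most extreme data inside the limits
whisker_lower <- min(x[x >= fence_lower])
whisker_upper <- max(x[x <= fence_upper])

cat("upper limit:", fence_upper, " upper whisker end:", whisker_upper, "\n")
```

Here the upper limit is 8.5, but the upper whisker stops at 5, the largest observation below that limit; the value 20 would be drawn as an outlier.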

csgillespie
  • Thanks a lot, that's exactly what I was looking for. Could you do me another favor? How do I enter this nicely written function into the Deducer package? Under geometric elements there is "points", where one can choose summary, then options, then custom; that is where I need your help entering the function get_tails. Thanks in advance. – doctorate Mar 23 '12 at 19:59
  • Sorry, I've never used the Deducer package – csgillespie Mar 25 '12 at 09:44
  • Hi csgillespie, I have tried the function get_tails and it works fine, but it has one flaw. If a group contains only one data value, its boxplot is drawn as a single short line, as expected, BUT the marks for both tails appear far away from it, at the periphery of the plot. How can the function be changed to stop these marks from showing up in that case, or do you have other suggestions? – doctorate Mar 28 '12 at 23:46
  • Add a line near the top of the function, along the lines of: `if(length(x) == 1){return(x)}` – csgillespie Mar 29 '12 at 07:23
  • Thanks a lot, csgillespie, for answering my question. Could you take a look at the other question, the one that refers to the median? Regards. – doctorate Jul 19 '12 at 18:16
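
A short sketch of why the single-value case raised in the comments misbehaves when the limits are compared with strict inequalities. The variable names are illustrative, and the link to marks landing at the plot edge is an interpretation of the reported symptom, not something confirmed in the thread.

```r
x <- 7  # a group with a single observation

q <- quantile(x, c(0.25, 0.75), names = FALSE)  # both quartiles equal 7
upper <- q[2] + 1.5 * (q[2] - q[1])             # IQR is 0, so the limit is 7

# Strict comparison: nothing is below the limit, so the subset is empty
strict <- x[x < upper]
up_strict <- suppressWarnings(max(strict))  # max() of an empty vector is -Inf

# Inclusive comparison keeps the single value
up_inclusive <- max(x[x <= upper])

cat("strict:", up_strict, " inclusive:", up_inclusive, "\n")
```

An infinite y value would explain marks that appear far from the box, which is what the `length(x) == 1` guard avoids.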