Does anyone know how to prevent the "whiskers" in ggthemes::geom_tufteboxplot from being drawn all the way to the extreme values? I tried changing the outlier and whisker arguments, to no avail.

library(ggplot2)
library(ggthemes)

ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot() 

With geom_boxplot, the whiskers extend to 1.5 × IQR as usual.

ggplot(iris, aes(Species, Sepal.Length)) +
  geom_tufteboxplot()

With geom_tufteboxplot, the "whiskers" extend to the extreme values.

Created on 2020-03-03 by the reprex package (v0.3.0)

tjebo
  • In my opinion, your whiskers look similar in both functions. In `geom_boxplot`, you can pass the argument `coef` to set the multiplier of the IQR. Is that what you are looking for? – dc37 Mar 03 '20 at 12:34
  • @dc37 `virginica`'s lower whisker extends to the outlier (the actual data extreme) instead of stopping at the default limit (i.e. within lower quartile - 1.5*IQR). I am very happy with `geom_boxplot`, so I don't need to change that. – tjebo Mar 03 '20 at 13:31
  • Sorry, my mistake. It is indeed different for `virginica`. – dc37 Mar 03 '20 at 14:41

1 Answer

I was able to find a workable solution by changing the stat to "boxplot". Here's a reprex (the last example shows how to hide outliers, although the axis range will still consider them; the work-around for that is more involved):

library(ggplot2)
library(ggthemes)

ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot()

ggplot(iris, aes(Species, Sepal.Length)) +
  ggthemes::geom_tufteboxplot(stat = "boxplot")

ggplot(iris, aes(Species, Sepal.Length)) +
  ggthemes::geom_tufteboxplot(stat = "boxplot", outlier.shape = NA)

Created on 2020-05-29 by the reprex package (v0.3.0)
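To see why switching the stat changes the whiskers, it can help to inspect what the "boxplot" stat actually computes. The following is a minimal sketch, not part of the original answer: it uses ggplot2's `stat_boxplot()` directly (the same stat that `stat = "boxplot"` selects) together with `layer_data()`, and compares the computed whisker ends with the raw extremes. The variable names `p` and `computed` are illustrative.

```r
library(ggplot2)

# Same data and mapping as above, but only the stat is of interest here.
p <- ggplot(iris, aes(Species, Sepal.Length)) +
  stat_boxplot()

# layer_data() returns the data frame computed for the first layer.
# ymin/ymax are the whisker ends; lower/middle/upper are the box.
computed <- layer_data(p)
computed[, c("x", "ymin", "lower", "middle", "upper", "ymax")]

# Raw minimum and maximum per species, for comparison with ymin/ymax.
aggregate(Sepal.Length ~ Species, data = iris, FUN = range)
```

For `virginica`, the computed `ymin` should be larger than the raw minimum, because the lowest observation lies more than 1.5 * IQR below the lower quartile and is treated as an outlier.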

Matt
  • That's great! Thanks. I don't see a flaw in the axis not adjusting when outliers are hidden: the data is still there, so that is not unintended behaviour. – tjebo May 29 '20 at 20:52
  • Yeah, now that I've played with it a bit more, merely hiding the outliers and not adjusting the axis range is the desired behavior. In my case, I had extreme outliers (akin to, for example, setting some sepal lengths in the iris data set to values in the 1000s) that I wanted to work around, and solved by manually calculating the values to use for the box plots to make sure values outside of 1.5 * IQR were ignored when plotting. – Matt May 30 '20 at 20:45
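The manual calculation mentioned in the last comment was not posted, so the following is only a sketch of one way it could be done, not the commenter's actual code. It assumes base R's `boxplot.stats()` for the five statistics and ggplot2's `geom_boxplot(stat = "identity")` for drawing; `box_stats` and `iris_extreme` are hypothetical names. Note that `boxplot.stats()` uses hinges, which can differ slightly from the quartiles ggplot2 computes. Whether `geom_tufteboxplot` accepts the same precomputed aesthetics has not been verified here.

```r
library(ggplot2)

# Hypothetical helper: five box-plot statistics for one group, with whisker
# ends at the most extreme points within coef * IQR of the hinges.
box_stats <- function(x, coef = 1.5) {
  s <- boxplot.stats(x, coef = coef)$stats
  data.frame(ymin = s[1], lower = s[2], middle = s[3],
             upper = s[4], ymax = s[5])
}

# Simulate an extreme outlier, as described in the comment.
iris_extreme <- iris
iris_extreme$Sepal.Length[1] <- 1000

# Compute the statistics per species and collect them in one data frame.
groups <- split(iris_extreme$Sepal.Length, iris_extreme$Species)
stats_df <- do.call(rbind, lapply(groups, box_stats))
stats_df$Species <- factor(names(groups), levels = levels(iris$Species))

# Draw the boxes from the precomputed values. The outlier is not in
# stats_df at all, so it affects neither the whiskers nor the axis range.
p <- ggplot(stats_df,
            aes(x = Species, ymin = ymin, lower = lower, middle = middle,
                upper = upper, ymax = ymax)) +
  geom_boxplot(stat = "identity")
print(p)
```

Because only the summary statistics are passed to the plot, the y axis is scaled to the whisker ends. The trade-off is that outliers are dropped entirely and would have to be added back as a separate point layer if wanted.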