I am trying to make a boxplot in which the width of each box is proportional to its group size. I have already calculated the summary statistics for each level of the categorical x variable: n, p5, p25, p50, p75 and p95. I have tried using the "n" variable as a set of weights, and I have tried the "varwidth=TRUE" argument in geom_boxplot. Neither has worked for me. Is there a way to get proportional widths when the summary statistics have already been calculated? Here is an example of what my code looks like:

src <- c("a", "b", "c")
n <- c(10, 20, 30)
p5 <- c(-10, -20, -15)
p25 <- c(-5, -10, -10)
p50 <- c(5, 0, 5)
p75 <- c(10, 5, 15)
p95 <- c(15, 20, 30)

df <- data.frame(src, n, p5, p25, p50, p75, p95)

MonthsSrc <- ggplot(data=df, aes(x=factor(src), ymin=p5, lower=p25, middle=p50, upper=p75, ymax=p95)) +
  geom_boxplot(stat="identity", fill="white", colour="black", varwidth=TRUE) +
  scale_y_continuous(name="Value(%)", limits=c(-30,30), breaks=c(-30,-15,0,15,30), labels=c(-30,-15,0,15,30)) +
  theme_classic() +
  theme(axis.line.x=element_line(color="black",size=0.5),
        axis.line.y=element_line(color="black",size=0.5))

Thanks for the help!

John M
  • The question could be improved. Please hover over the R tag - it asks for a minimal reproducible example. [Here's a guide](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610); also look at the R help files (e.g. `?geom_boxplot`, _examples_ section) – lukeA Jan 04 '17 at 19:55
  • I have added an example data set. – John M Jan 04 '17 at 20:28

1 Answer

One (hacky) way of doing this would be to use varwidth=TRUE, which makes the box width proportional to sqrt(n). We could then use the weight aesthetic with weight=sqrt(n) to bring the width back to n, since sqrt(n)*sqrt(n)=n.

Example with the mpg dataset from ggplot2:

library(dplyr)
library(ggplot2)

# Attach the group size n to every row, then map weight = sqrt(n)
mpg %>%
  group_by(manufacturer) %>%
  mutate(n = n()) %>%
  ggplot(aes(x = manufacturer, y = displ)) +
  geom_boxplot(aes(weight = sqrt(n)), varwidth = TRUE)

There might be a better solution.
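
For summary statistics that are already computed, as in the question, another option worth trying is to map a width aesthetic directly. This is only a sketch: it assumes that geom_boxplot with stat="identity" honours a `width` aesthetic in your ggplot2 version, and the column name `rel_width` and the 0.9 scaling factor are illustrative choices, not part of the original code.

```r
library(ggplot2)

# Summary statistics from the question
df <- data.frame(
  src = c("a", "b", "c"),
  n   = c(10, 20, 30),
  p5  = c(-10, -20, -15),
  p25 = c(-5, -10, -10),
  p50 = c(5, 0, 5),
  p75 = c(10, 5, 15),
  p95 = c(15, 20, 30)
)

# Hypothetical helper column: box width relative to the largest group,
# scaled so the widest box spans 0.9 of one x-axis unit
df$rel_width <- df$n / max(df$n) * 0.9

# Assumption: with stat = "identity", the width aesthetic sets each box's width
p <- ggplot(df, aes(x = src, ymin = p5, lower = p25, middle = p50,
                    upper = p75, ymax = p95, width = rel_width)) +
  geom_boxplot(stat = "identity", fill = "white", colour = "black")

p
```

Replacing `df$n` with `sqrt(df$n)` in the helper column would mimic the sqrt(n) scaling that varwidth uses.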

Haboryme