I have built a boxplot with ggplot and want to display the actual values of the 1st quartile, median and 3rd quartile in the boxplot.

Since the boxplot already shows the 1st quartile, median and 3rd quartile, I would assume there is a simple function to display the values in the boxplot itself. This is my current code:

library(ggplot2)
library(scales)  # provides percent(), used for the y-axis labels below

CAGR <- ggplot(datalong5, aes(x = Year, y = value, fill = new)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  coord_cartesian(ylim = c(-0.1, 0.2)) +
  scale_fill_discrete(name = "") +
  theme(axis.title.x = element_blank()) +
  theme(axis.title.y = element_blank())

CAGR +
  scale_y_continuous(labels = percent, breaks = seq(-0.1, 0.3, by = 0.05)) +
  scale_fill_discrete(name = "Forklaring") +
  theme(legend.position = c(0.85, 0.15)) +
  theme(axis.text = element_text(size = 12, colour = "black"))

Picture of my boxplot

I have tried stat_summary(fun = median), but that does not work for me: it only displays the median as a point, whereas I want the value itself. Thanks in advance.

Santiago
Chris_R
  • related https://stackoverflow.com/questions/19876505/boxplot-show-the-value-of-mean and https://stackoverflow.com/questions/34977882/get-quantile-values-from-geom-boxplot – tjebo Apr 14 '23 at 06:35
  • Chris, would you mind modifying your question to include reproducible data? Check Santiago's answer for an example - you could use exactly the same in this case (the iris example). It would also be great if you could show your desired output, e.g. as a sketch – tjebo Apr 14 '23 at 06:59

2 Answers

I'd first find the points you want to include using quantile and then I'd use geom_text to annotate the plot.

As I don't have your data, here's a minimal example using iris:

library(ggplot2)
library(dplyr)  # reframe() requires dplyr >= 1.1.0

quartiles <- iris |> 
  group_by(Species) |> 
  reframe(y = quantile(Sepal.Width, c(.25, .5, .75)))

ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  geom_text(
    data = quartiles,
    aes(
      x = Species,
      y = y,
      label = y
    ),
    nudge_x = .4,
    hjust = 0
  )

Output:

output
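
The plot in the question also maps fill to a second variable, so there are several boxes per x position and the labels need to be dodged along with them. Here is a rough sketch of how that might look. ToothGrowth is only stand-in data (dose and supp play the role of Year and new, which I don't have), and the dodge width of 0.75 is an assumption that usually lines up with default boxplots; adjust it if the labels sit off-centre.

```r
library(ggplot2)
library(dplyr)

# Stand-in data: dose on the x axis, supp as the fill variable
tg <- ToothGrowth |>
  mutate(dose = factor(dose))

# Quartiles for every x/fill combination
quartiles <- tg |>
  group_by(dose, supp) |>
  reframe(y = quantile(len, c(.25, .5, .75)))

p <- ggplot(tg, aes(x = dose, y = len, fill = supp)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  geom_text(
    data = quartiles,
    aes(y = y, label = round(y, 1), group = supp),
    # Assumed width; nudge_x cannot be combined with position
    position = position_dodge(width = 0.75),
    vjust = -0.3,
    size = 3
  )

p
```

Because position and nudge_x cannot be used together in geom_text, the labels sit just above each quartile line here instead of beside the box.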

Consult the documentation of geom_text for more formatting and alignment options. You may also want to check out geom_label.
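
As an illustration of those formatting options, here is a small sketch that pre-formats the labels and draws them with geom_label. The two-decimal format and the text size are arbitrary choices of mine. For proportion data like in the question, something like scales::percent(y, accuracy = 0.1) could replace sprintf, but I have not tried that on your data.

```r
library(ggplot2)
library(dplyr)

# Quartiles per species plus a pre-formatted label column
quartiles <- iris |>
  group_by(Species) |>
  reframe(
    y = quantile(Sepal.Width, c(.25, .5, .75)),
    label = sprintf("%.2f", y)  # two decimals, arbitrary choice
  )

p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  geom_label(
    data = quartiles,
    aes(x = Species, y = y, label = label),
    nudge_x = .4,
    hjust = 0,
    size = 3
  )

p
```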

Santiago

Another option is to use stat_summary. This is just to show the possibility. In my experience, it can often be frustrating to get this working, as it requires some (for me, often too) virtuoso handling of after_stat, stage and the like. Calculating the aggregate measures outside of ggplot (e.g., user Santiago's answer) is much easier and less error-prone.

library(ggplot2)

ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  stat_summary(aes(y = stage(Sepal.Width, after_stat = quarts), 
    label = after_stat(quarts)),
    geom = "text",
    fun.data = ~ data.frame(quarts = quantile(.x, probs = c(.25, .75)))
  )

Created on 2023-04-14 with reprex v2.0.2
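
If only the median is needed as text (the question mentions stat_summary(fun = median) drawing a point), a simpler variant may be enough. This is a sketch, assuming ggplot2 >= 3.3.0 for after_stat and the fun argument; the rounding and vjust values are arbitrary.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_boxplot(outlier.shape = NA, coef = 0) +
  stat_summary(
    fun = median,
    geom = "text",  # draw text instead of the default point
    aes(label = after_stat(round(y, 2))),  # y is the computed median
    vjust = -0.5
  )

p

# Inspect what the text layer (layer 2) will draw
layer_data(p, 2)
```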

tjebo