
I am using facet_grid to draw several plots and I am wondering how to add some extra information as a caption in each individual plot.

I managed to add information in the title of each plot (in order to add the Kruskal-Wallis p-value), but I would like to add more info below each plot (as a caption).

Here is a reproducible example:

library(ggplot2)
library(dplyr)
set.seed(1234)
Gene <- floor(runif(25, min=0, max=101))
Age <- floor(runif(25, min=18, max=75))
Group <- c("Group1", "Group1", "Group3", "Group2", "Group1", "Group3", "Group2", "Group2", "Group2", "Group1", "Group1", "Group3", "Group1", "Group2", "Group1", "Group2", "Group3", "Group1", "Group3", "Group3", "Group2", "Group1", "Group3", "Group3","Group2")


df <- data.frame(Gene, Age, Group)
df$Group <- as.factor(df$Group)

mybreaks <- seq(min(df$Age)-1, to=max(df$Age)+10, by=10)
df$groups_age <- cut(df$Age, breaks = mybreaks)  # cut() has no 'by' argument; the bin width comes from mybreaks

bp <- ggplot(df, aes(x=groups_age, y=Gene, group=groups_age)) + 
  geom_boxplot(aes(fill=groups_age)) + 
  facet_grid(. ~ Group)

bp

pval <- df %>%
  group_by(Group) %>%
  summarize(Kruskal_pvalue = kruskal.test(Gene ~ groups_age)$p.value)

# Create new labels for facet_grid that show the group and the KW p-value.
labels <- c(paste('Group 1\nKW p-val:', signif(subset(pval$Kruskal_pvalue, pval$Group=="Group1"), digits = 3)),
            paste('Group 2\nKW p-val:', signif(subset(pval$Kruskal_pvalue, pval$Group=="Group2"), digits = 3)),
            paste('Group 3\nKW p-val:', signif(subset(pval$Kruskal_pvalue, pval$Group=="Group3"), digits = 3)))

df$KW <- factor(df$Group, levels = levels(df$Group), labels = labels)


bp <- ggplot(df, aes(x=groups_age, y=Gene, group=groups_age)) + 
  geom_boxplot(aes(fill=groups_age)) + 
  facet_grid(. ~ KW) +
  theme(legend.position="none")
bp

This is the result of the code above: image 1

This is the only way I could think of to add information about each plot as a caption:

df_group1 <- df[df$Group == "Group1",]
df_group2 <- df[df$Group == "Group2",]
df_group3 <- df[df$Group == "Group3",]

myfunction <- function(DF){
  df <- as.data.frame(table(DF$groups_age))
  # This is to add  ": n = " to the first column
  df$Var1 <- paste(df$Var1, ": n = ", sep = "")
  # We join both columns in one to have the result together.
  df$X <- paste(df$Var1, df$Freq)
  # We save that column into a variable 
  vec <-  df[["X"]]
  return(vec)
}

numb_group1 <- myfunction(df_group1)
numb_group1 <- paste(numb_group1, collapse = "; ") 

numb_group2 <- myfunction(df_group2)
numb_group2 <- paste(numb_group2, collapse = "; ") 

numb_group3 <- myfunction(df_group3)
numb_group3 <- paste(numb_group3, collapse = "; ") 

numb_all <- c(numb_group1, numb_group2, numb_group3)


bp <- bp + labs(caption = paste0("Group 1: n = ", nrow(subset(df, df$Group=="Group1")), 
                                 "\n", 
                                 "           Groups: ", numb_all[1],
                                 "\n",
                                 "\n",
                                 "Group 2: n = ", nrow(subset(df, df$Group=="Group2")), 
                                 "\n",
                                 "           Groups: ", numb_all[2],
                                 "\n",
                                 "\n",
                                 "Group 3: n = ", nrow(subset(df, df$Group=="Group3")), 
                                 "\n",
                                 "           Groups: ", numb_all[3]
)) +  theme(legend.position="none",
            plot.caption = element_text(hjust = 0, face= "italic")) #Default is hjust=1
bp

This is how it looks: image 2

However, I would like to improve my code and find another way (if one exists) to put each piece of information below its own plot.

Does anyone have an idea of how this can be done?

Thanks very much in advance

emr2
  • I don't have a multi-caption facet solution to hand, but if I didn't have to make too many plots, I would create separate plots with individual captions and add them together with the Patchwork library to give the illusion of a facetted plot (with added customisation at the cost of creating additional plots). – jpenzer Jan 25 '22 at 14:40

2 Answers


Generally-speaking for plot captions on multi-faceted plots:

  • If you want a single caption which is below all plots, you should use labs(caption = ...) and style it with theme(plot.caption = ...).

  • If you want the same caption to appear below each facet, you can do this using annotate() and turn clipping off.

  • If you want to have different captions to appear below each facet, you will need something capable of being mapped to a dataset (so you can specify the different text per facet). In this case, I would recommend using geom_text() and doing a clever bit of formatting to fit in the caption.

  • An alternative way to get a different caption per plot is to create individual plots with captions and combine them via grid.arrange() (from gridExtra), patchwork or cowplot.
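The last option in the list can be sketched roughly as follows. This is only an illustration: it assumes the patchwork package is installed, the caption texts are made-up placeholders, and the shared axis limits are my own choice so that the separate plots stay comparable like facets would.

```r
library(ggplot2)
library(patchwork)  # assumption: installed; provides wrap_plots()

# Hypothetical captions, one per cyl value (placeholder text only)
captions <- c("4" = "caption for 4 cylinders",
              "6" = "caption for 6 cylinders",
              "8" = "caption for 8 cylinders")

# Build one plot per cyl value, each with its own caption
plots <- lapply(names(captions), function(k) {
  ggplot(subset(mtcars, cyl == as.numeric(k)), aes(qsec, mpg)) +
    geom_point() +
    # Same limits in every plot, mimicking fixed facet scales
    coord_cartesian(xlim = range(mtcars$qsec), ylim = range(mtcars$mpg)) +
    labs(title = paste("cyl =", k), caption = captions[[k]]) +
    theme(plot.caption = element_text(hjust = 0, face = "italic"))
})

# Place the plots side by side, like a single row of facets
combined <- wrap_plots(plots, nrow = 1)
combined
```

The trade-off is that axes and titles are repeated in every plot, so this only approximates the look of a real facetted plot.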

Here's the example of the third case using geom_text() and mtcars. I hope you can apply this to your own dataset.

The basic plot

Here's the basic plot we'll use for adding a caption:

library(ggplot2)
p <- ggplot(mtcars, aes(qsec, mpg)) + geom_point() +
        facet_wrap(~cyl)


Caption Data frame

To make the caption plot, we first need to define the text for each facet. It's best to do this in a separate data frame from your bulk data. This ensures that there is no overplotting of the text geom (drawing in the same place multiple times), since one text geom is drawn per observation in a data frame. Here's our data frame for captions:

caption_df <- data.frame(
  cyl = c(4,6,8),
  txt = c("carb=4", "carb=6", "carb=8, OMG!")
)
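If the caption text should come from the data rather than being typed by hand, a data frame with the same two columns could be built along these lines. This is a sketch with an assumed caption format ("n = ..."), using only base R; it would replace the hand-written caption_df above.

```r
# Count the observations per cyl value
counts <- as.data.frame(table(cyl = mtcars$cyl), responseName = "n")

# table() returns cyl as a factor; convert it back to numeric so its
# type matches mtcars$cyl, which the facets are built from
caption_df <- data.frame(
  cyl = as.numeric(as.character(counts$cyl)),
  txt = paste0("n = ", counts$n)
)
caption_df
```

Any per-facet summary can go into txt this way, as long as there is exactly one row per facet.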

Plotting with captions

To make the plot, we need to adjust a few things in our plot.

  • Add the caption. Add a geom_text() and map to caption_df. We'll map the text, but the position will be fixed in x and y. The x value is set to be the minimum of our original data, but we could set that manually too. The y value needs to be set to a value that places it below the original plot area.

  • Confine the limits of the plot. Since we place our text geom below the original plot area, if we did not confine the limits of the plot area, ggplot2 would just expand the y limits to fit the new text. We need to keep the original y limits to ensure the y value of the geom_text() we add stays below this area.

  • Turn off clipping. In order to actually see the new captions, you need to turn off clipping. You can do this in any of the coord_*() functions, so we'll use coord_cartesian() to do this and set the y limits.

  • Increase lower margin. To ensure we see the caption in the final image, we need to increase the margin below the plot via theme(plot.margin=...).

Here's the final result of all that.

ggplot(mtcars, aes(qsec, mpg)) + geom_point() + facet_wrap(~cyl) +
  coord_cartesian(clip="off", ylim=c(10, 40)) +
  geom_text(
    data=caption_df, y=5, x=min(mtcars$qsec),
    mapping=aes(label=txt), hjust=0,
    fontface="italic", color="red"
  ) +
  theme(plot.margin = margin(b=25))
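Applied to data like that in the question, the same idea might look like the sketch below. Several things here are assumptions rather than part of the original code: the data frame is a stand-in with arbitrary values and a simplified Group column, the caption format is invented, and the y offset and bottom margin are guesses that will need adjusting by eye so the text clears the axis labels.

```r
library(ggplot2)
library(dplyr)

# Stand-in data shaped like the question's df (values are arbitrary)
set.seed(1234)
df <- data.frame(
  Gene  = floor(runif(25, min = 0, max = 101)),
  Age   = floor(runif(25, min = 18, max = 75)),
  Group = factor(rep(c("Group1", "Group2", "Group3"), length.out = 25))
)
df$groups_age <- cut(df$Age, breaks = seq(min(df$Age) - 1, max(df$Age) + 10, by = 10))

# One caption row per facet: total n plus the count in each age bin
caption_df <- df %>%
  count(Group, groups_age) %>%
  group_by(Group) %>%
  summarise(
    txt = paste0("n = ", sum(n), "\n",
                 paste0(groups_age, ": ", n, collapse = "\n")),
    .groups = "drop"
  )

bp <- ggplot(df, aes(x = groups_age, y = Gene)) +
  geom_boxplot(aes(fill = groups_age)) +
  facet_grid(. ~ Group) +
  # Fix the y range so the text placed below it does not stretch the axis
  coord_cartesian(clip = "off", ylim = c(0, 100)) +
  geom_text(
    data = caption_df, aes(label = txt),
    x = 0.5, y = -25,          # left edge of the discrete x axis, below the panel
    hjust = 0, vjust = 1, size = 3, fontface = "italic",
    inherit.aes = FALSE
  ) +
  theme(legend.position = "none",
        # Room for the multi-line text; tune to the output size
        plot.margin = margin(b = 100))
bp
```

Because the text is positioned in data units, the right values for y and the margin depend on the size of the output device.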


chemdork123

After having tried a lot of things with facet_grid and captions, I created some posts where I got really great answers that can help someone with this problem.

This is the main solution: https://stackoverflow.com/a/71557785/13997761

Automating that code raised a few further questions, though: https://stackoverflow.com/a/71561745/13997761 and https://stackoverflow.com/a/71569950/13997761.

However, I realised that in this case it is better to put the number of observations above each boxplot. It is more visual and does not need much code.

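myFreqs <- df %>%
  group_by(Group, groups_age) %>%
  summarise(Freq = n(), .groups = "drop")
myFreqs

# Note: the labels are matched to the boxes by position, so this relies on
# myFreqs being ordered the same way the boxes are drawn (by Group, then groups_age).
bp + stat_summary(geom = 'text', label = paste("n =", myFreqs$Freq), fun = max, vjust = -1, position = position_dodge(width=0.7))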
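A variant that does not depend on row order is to pass the counts as their own data frame to geom_text(), so each label is tied to its facet and x position. The sketch below is an assumption-laden illustration, not the code from the linked answers: the data frame is a stand-in with arbitrary values and made-up age bin names, and the extra `top` column and the axis expansion are my own additions.

```r
library(ggplot2)
library(dplyr)

# Stand-in data shaped like the question's df (values and bin names are arbitrary)
set.seed(1234)
df <- data.frame(
  Gene       = floor(runif(25, min = 0, max = 101)),
  Group      = factor(rep(c("Group1", "Group2", "Group3"), length.out = 25)),
  groups_age = factor(sample(c("young", "middle", "old"), 25, replace = TRUE))
)

# Count and highest value per box
myFreqs <- df %>%
  group_by(Group, groups_age) %>%
  summarise(Freq = n(), top = max(Gene), .groups = "drop")

bp_n <- ggplot(df, aes(x = groups_age, y = Gene)) +
  geom_boxplot(aes(fill = groups_age)) +
  # x is inherited from the plot; Group in myFreqs places each label in its facet
  geom_text(data = myFreqs,
            aes(y = top, label = paste("n =", Freq)),
            vjust = -0.5) +
  # Extra headroom so labels above the highest box are not cut off
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.15))) +
  facet_grid(. ~ Group) +
  theme(legend.position = "none")
bp_n
```

Empty Group and age-bin combinations simply produce no row in myFreqs, so no label is drawn for them.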


emr2