align geom_text with geom_boxplot in ggplot2

Question

Suppose I have the following two data sets:

behaviorm <- structure(list(sentential_connective = c("IF", "IF", "IF", "IF", 
"IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", 
"IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", 
"IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", "IF", 
"IF", "IF", "IF"), mentioned_object = c("Same", "Same", "Same", 
"Same", "Same", "Same", "Same", "Same", "Same", "Same", "Same", 
"Same", "Same", "Same", "Same", "Same", "Same", "Same", "Same", 
"Same", "Same", "Same", "Same", "Same", "Same", "Same", "Same", 
"Same", "Same", "Same", "Same", "Same", "Same", "Same", "Same", 
"Same", "Same", "Same", "Same", "Same"), agent_mood = c("Sad", 
"Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", 
"Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", 
"Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", 
"Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", "Sad", 
"Sad", "Sad", "Sad"), Chosen_Box = c("SD", "SD", "SD", "SD", 
"SD", "SD", "SD", "SD", "SD", "SD", "CS", "CS", "CS", "CS", "CS", 
"CS", "CS", "CS", "CS", "CS", "SS", "SS", "SS", "SS", "SS", "SS", 
"SS", "SS", "SS", "SS", "DD", "DD", "DD", "DD", "DD", "DD", "DD", 
"DD", "DD", "DD"), participant = c("a01", "a02", "a03", "a04", 
"a05", "a06", "a07", "a08", "a09", "a10", "a01", "a02", "a03", 
"a04", "a05", "a06", "a07", "a08", "a09", "a10", "a01", "a02", 
"a03", "a04", "a05", "a06", "a07", "a08", "a09", "a10", "a01", 
"a02", "a03", "a04", "a05", "a06", "a07", "a08", "a09", "a10"
), Counts = c(12L, 8L, 12L, 6L, 3L, 12L, 9L, 12L, 12L, 11L, 0L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, 0L, 0L, 4L, 0L, 4L, 9L, 0L, 2L, 0L, 0L, 0L)), row.names = c(1L, 
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 73L, 74L, 75L, 76L, 77L, 
78L, 79L, 80L, 81L, 82L, 145L, 146L, 147L, 148L, 149L, 150L, 
151L, 152L, 153L, 154L, 217L, 218L, 219L, 220L, 221L, 222L, 223L, 
224L, 225L, 226L), class = "data.frame")


res <- structure(list(sentential_connective = c("IF", "IF", "IF", "IF"
), mentioned_object = c("Same", "Same", "Same", "Same"), agent_mood = c("Sad", 
"Sad", "Sad", "Sad"), Chosen_Box = c("SD", "CS", "SS", "DD"), 
    statistic = c(54, 0, 0, 8), p.value = c(0.00362357852661936, 
    0.999052845531107, 0.999052845531107, 0.937942586194492), 
    Sig = c("*", "", "", ""), Counts = c(12L, 1L, 1L, 9L)), class = "data.frame", row.names = c(1L, 
73L, 145L, 217L))

And the following code I used to do the plotting:


library(ggplot2)
b <- ggplot()
b <- b + aes(x = sentential_connective, y = Counts)
b <- b + geom_boxplot(aes(color = Chosen_Box), data = behaviorm, outlier.color = NA)
b <- b + geom_text(   aes(label = Sig),        data = res, position = position_dodge2(width = 0.9), size = 5)
b

In the resulting graph, however, the labels from the res data frame are not aligned with the boxplots from the behaviorm data frame.

I tried the suggestions from a related explanation, without success.

Any suggestions? Thanks.

dc37 · Accepted Answer · 2019-12-11T07:20:58.410

Because your sentential_connective has a single level, IF, all four boxplots are dodged around one x position. Instead of x = sentential_connective, use x = Chosen_Box so that each boxplot gets its own x position, and reuse that x in geom_text to place the labels from the dataset res.

Something like this should work:

library(ggplot2)
ggplot(behaviorm, aes(x = Chosen_Box, y = Counts))+
  geom_boxplot(aes(color = Chosen_Box), outlier.color = NA) +
  geom_text(data = res, aes(x = Chosen_Box, label = Sig, y = 13), size = 10)+
  ylim(0,14)

EDIT: Adding geom_text for multiple variables

Based on your comment, sentential_connective has at least two levels in your real data and you would like to keep it as the x variable.

To reproduce that, I first duplicate your data so that both datasets have two levels of sentential_connective:

behavior2 = behaviorm
behavior2$sentential_connective <- "IF2"
behav = rbind(behaviorm, behavior2)

res2 = res
res2$sentential_connective = "IF2"
RES = rbind(res2,res)

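For the plot, you were close with position_dodge2(). The fix is to use position_dodge() with the same width in both geom_boxplot and geom_text, and to set group = Chosen_Box in geom_text so that the labels are dodged by the same variable as the boxes.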

ggplot(behav, aes(x = sentential_connective, y = Counts, color = Chosen_Box)) +
  geom_boxplot(position = position_dodge(width = 0.8), outlier.color = NA) +
  geom_text(data = RES, aes(x = sentential_connective, y = 13, label = Sig, group = Chosen_Box), size = 10, color = "black", position = position_dodge(width = 0.8)) +
  ylim(0, 14)
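
To check the alignment rather than judge it by eye, here is a minimal sketch that compares the dodged x positions of the two layers. The data frames `toy` and `lab_df` and their columns are made up for illustration and are not the data from the question. The sketch assumes the label data has one row per box, including rows with an empty label, as `res` does.

```r
library(ggplot2)

# Toy data (hypothetical): two x levels, two boxes per level.
toy <- data.frame(
  connective = rep(c("IF", "IF2"), each = 8),
  box        = rep(rep(c("CS", "SD"), each = 4), times = 2),
  counts     = c(0, 1, 0, 2, 9, 12, 11, 8, 1, 0, 0, 1, 10, 12, 7, 12)
)

# One label row per (connective, box) pair, including empty labels.
lab_df <- data.frame(
  connective = rep(c("IF", "IF2"), each = 2),
  box        = rep(c("CS", "SD"), times = 2),
  sig        = c("", "*", "", "*")
)

# Share one dodge specification so both layers use the same width.
dodge <- position_dodge(width = 0.8)

p <- ggplot(toy, aes(x = connective, y = counts, color = box)) +
  geom_boxplot(position = dodge) +
  geom_text(data = lab_df, aes(y = 13, label = sig, group = box),
            color = "black", position = dodge)

# layer_data() returns the data after position adjustments are applied.
box_x  <- sort(as.numeric(layer_data(p, 1)$x))
text_x <- sort(as.numeric(layer_data(p, 2)$x))

print(box_x)
print(text_x)
```

If the two printed vectors match, each label sits over the center of its box. Removing `group = box` from `geom_text` or using a different dodge width in one layer is an easy way to see the positions drift apart.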

Is this what you are looking for?

Thank you very much. But the data I gave is a simplified one of my original data, where the variable `sentential_connective` contains two levels. So it would be better to keep the information of sentential connectives in the graph. Thanks. — Likan Zhan, Dec 11 '19 at 07:06
Got it! I edited my answer, it should solve your issue. Let me know if it is what you were looking for. — dc37, Dec 11 '19 at 07:21
Thank you very much, it seems that `group = Chosen_Box` is crucial for the result. — Likan Zhan, Dec 11 '19 at 11:05
Yes, it is! But using `position_dodge()` in both `geom_boxplot` and `geom_text` is really important too. Glad to be able to help you! — dc37, Dec 11 '19 at 14:29
