Plot the mean on a barplot without overlapping geom_text

Question

Very simple question. I'm trying to add the mean of each type as a label on the barplot below. The problem is that whenever I try, I get either the single value of mean(varUnlist) or a bunch of duplicated values, one per row. By the way, are the error bars correct? They seem to be the same for all types.

plot:

code:

myData %>% 
  ggplot(aes(varUnlist, type, fill = type)) +      # ggplot2 barplot with error bars
  coord_flip() +
  geom_bar(stat = "summary", fun.data = mean_se,  width = 0.2) +    
  stat_summary(geom = "errorbar", fun.data = mean_se, position = "dodge", width = 0.2) + 
  scale_fill_brewer(palette = "Blues", name = 'type:') +
  geom_text(label = str_glue('{mean(myData$varUnlist)}'), check_overlap = TRUE) +
  labs(x = "x",
       y = "median of Y",
       title = str_glue("effect of X on Y")) +
  theme_bw() +
  theme(legend.position = "right",
        legend.background = element_rect(color = "black"),  # legend box
        legend.margin = margin(5, 5, 3, 3),
        axis.text.x = element_text(angle = 0, hjust = 0.5, face = "bold"), # bottom axis labels; alternative: axis.text.x = element_blank()
        plot.title = element_text(hjust = 0.5, face = "bold"),
        axis.text.y = element_text(face = "bold"))

data:

> dput(myData)
structure(list(varUnlist = c(Outcome1 = 6.3, Outcome2 = 5.4, 
Outcome3 = 7.6, Outcome4 = 8.6, Outcome5 = 8.1, Outcome6 = 8.2, 
Outcome7 = 8.3, Outcome8 = 9.1, Outcome9 = 8.7, Outcome10 = 6.8, 
Outcome11 = 5.7, Outcome12 = 7, Outcome13 = 8.5, Outcome14 = 7.6, 
Outcome15 = 8.2, Outcome16 = 8.3, Outcome17 = 9.1, Outcome18 = 7.8, 
Outcome19 = 7.8, Outcome20 = 7.4, Outcome21 = 8.3, Outcome22 = 9.1, 
Outcome23 = 8.4, Outcome24 = 8.8, Outcome25 = 5.6, Outcome26 = 8.2, 
Outcome27 = 7.3, Outcome28 = 9.2, Outcome29 = 9.2, Outcome30 = 8.8, 
Outcome31 = 8, Outcome32 = 8.3, Outcome33 = 7.8, Outcome34 = 7.8, 
Outcome35 = 6.4, Outcome36 = 5.6, Outcome37 = 8, Outcome38 = 7.2, 
Outcome39 = 6.8, Outcome40 = 8.2, Outcome41 = 8, Outcome42 = 9.1, 
Outcome43 = 5.3, Outcome44 = 6.6, Outcome45 = 7.4, Outcome46 = 4.6, 
Outcome47 = 5.1, Outcome48 = 5.8, Outcome49 = 7.6, Outcome50 = 6.7, 
Outcome51 = 7.2, Outcome52 = 6.8, Outcome53 = 6, Outcome54 = 5.8, 
Outcome55 = 8.1, Outcome56 = 8.7, Outcome57 = 8.3, Outcome58 = 7, 
Outcome59 = 7.4, Outcome60 = 7.6, Outcome61 = 6.2, Outcome62 = 5.7, 
Outcome63 = 7.1, Outcome64 = 6.6, Outcome65 = 7.5, Outcome66 = 7.1, 
Outcome67 = 7.7, Outcome68 = 8.3, Outcome69 = 7.2, Outcome70 = 8, 
Outcome71 = 8.8, Outcome72 = 8.3, Outcome73 = 7.5, Outcome74 = 8.3, 
Outcome75 = 6.4, Outcome76 = 5.4, Outcome77 = 5.8, Outcome78 = 6.5, 
Outcome79 = 7.6, Outcome80 = 8.6, Outcome81 = 7.8, Outcome82 = 4, 
Outcome83 = 4.8, Outcome84 = 5.2, Outcome85 = 6.8, Outcome86 = 7.5, 
Outcome87 = 6.1, Outcome88 = 7.1, Outcome89 = 8, Outcome90 = 8.4, 
Outcome91 = 7.2, Outcome92 = 9, Outcome93 = 8.2, Outcome94 = 7.6, 
Outcome95 = 8.2, Outcome96 = 7.2, Outcome97 = 7.6, Outcome98 = 9.1, 
Outcome99 = 8, Outcome100 = 7.8, Outcome101 = 8.5, Outcome102 = 7, 
Outcome103 = 8, Outcome104 = 7.5, Outcome105 = 8.7, Outcome106 = 9.2, 
Outcome107 = 8.7, Outcome108 = 9.1, Outcome109 = 5.8, Outcome110 = 6.5, 
Outcome111 = 7.4, Outcome112 = 8, Outcome113 = 7.8, Outcome114 = 6.5, 
Outcome115 = 7.8, Outcome116 = 8.2, Outcome117 = 7.2, Outcome118 = 8.8, 
Outcome119 = 7.9, Outcome120 = 7.6, Outcome121 = 6.5, Outcome122 = 7.2, 
Outcome123 = 5.7, Outcome124 = 8.7, Outcome125 = 8.1, Outcome126 = 7.8, 
Outcome127 = 6.3, Outcome128 = 5.2, Outcome129 = 7.8, Outcome130 = 6.6, 
Outcome131 = 7.2, Outcome132 = 5.8, Outcome133 = 8, Outcome134 = 7.1, 
Outcome135 = 7.4, Outcome136 = 6, Outcome137 = 7.2, Outcome138 = 5, 
Outcome139 = 8, Outcome140 = 7.6, Outcome141 = 7.2, Outcome142 = 5.3, 
Outcome143 = 5, Outcome144 = 6, Outcome145 = 9, Outcome146 = 8.8, 
Outcome147 = 8.5, Outcome148 = 4.8, Outcome149 = 5.2, Outcome150 = 6, 
Outcome151 = 8.6, Outcome152 = 9, Outcome153 = 8, Outcome154 = 7.6, 
Outcome155 = 8, Outcome156 = 7.2, Outcome157 = 7.4, Outcome158 = 8.3, 
Outcome159 = 7, Outcome160 = 7.6, Outcome161 = 5.2, Outcome162 = 6.4, 
Outcome163 = 5.8, Outcome164 = 7.4, Outcome165 = 6.6, Outcome166 = 5.5, 
Outcome167 = 5, Outcome168 = 5, Outcome169 = 5, Outcome170 = 5.3, 
Outcome171 = 5.5, Outcome172 = 8.2, Outcome173 = 8.4, Outcome174 = 8.8, 
Outcome175 = 5.6, Outcome176 = 5.2, Outcome177 = 6.4, Outcome178 = 4.6, 
Outcome179 = 4.8, Outcome180 = 4, Outcome181 = 6.2, Outcome182 = 5.8, 
Outcome183 = 5.6, Outcome184 = 6, Outcome185 = 6.8, Outcome186 = 5.8, 
Outcome187 = 6.6, Outcome188 = 7.6, Outcome189 = 7.2), type = structure(c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A", 
"B", "C"), class = "factor")), row.names = c(NA, -189L), class = c("tbl_df", 
"tbl", "data.frame"))

Captain Hat · Accepted Answer · 2022-11-29T15:35:42.970

2

You'll probably have an easier time if you do your summarising prior to your plotting. Then you can just name variables directly for your aesthetics instead of calculating 'on the fly'.

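Presently your geom_text() is inheriting the x and y aesthetics of the raw data, so it draws one label per row (189 of them), while the bar heights are only summarised whilst plotting. The label itself, mean(myData$varUnlist), is the overall mean rather than a per-type mean.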

I would instead do something like this:

library(dplyr)
library(ggplot2)
library(tidyr)

summarised <- 
  myData |> 
  group_by(type) |> 
  summarise(mean_se(varUnlist))

summarised
#> # A tibble: 3 × 4
#>   type      y  ymin  ymax
#>   <fct> <dbl> <dbl> <dbl>
#> 1 A      7.12  6.96  7.28
#> 2 B      7.28  7.11  7.45
#> 3 C      7.14  6.99  7.28

ggplot(summarised, aes(x = type, y = y, fill = type)) + 
  geom_col() +  # same as geom_bar(stat = "identity")
  geom_errorbar(aes(ymin = ymin, ymax = ymax)) + 
  scale_fill_brewer() +
  geom_text(
    aes(label = round(y, 3)),
    nudge_y = 0.5) # so the labels don't hit the tops of the bars

Created on 2022-11-29 with reprex v2.0.2
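If you would rather keep the summarising inside ggplot, a rough alternative is to let stat_summary() compute the label as well, so the text layer gets one row per type instead of one per observation. This is only a sketch: `toy` is a made-up stand-in for myData (same column names, hypothetical values), and the `vjust` offset is a guess that you would tune by eye.

```r
library(ggplot2)

# Toy stand-in for myData (hypothetical values, same column names as the question)
toy <- data.frame(
  varUnlist = c(6.3, 8.6, 8.3, 5.4, 8.1, 9.1, 7.6, 8.2, 8.7),
  type      = factor(rep(c("A", "B", "C"), each = 3))
)

p <- ggplot(toy, aes(x = type, y = varUnlist, fill = type)) +
  # bars: one mean per type, computed at plotting time
  stat_summary(geom = "bar", fun = mean, width = 0.5) +
  # error bars: mean +/- 1 standard error
  stat_summary(geom = "errorbar", fun.data = mean_se, width = 0.2) +
  # labels: the stat collapses each type to its mean before the text is drawn
  stat_summary(
    geom = "text", fun = mean,
    aes(label = after_stat(round(y, 2))),
    vjust = -2  # assumed offset so the label clears the error bar
  )
```

Printing `p` draws the plot. The pre-summarised version above is usually easier to debug, because you can inspect the table before plotting it.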

edited Nov 29 '22 at 15:35

answered Nov 29 '22 at 14:45


Thank you very much! Now it's working. Silly question, but still: `ymin` and `ymax` are equal to the minimum and maximum values of the sample, right? Shouldn't I plot the confidence intervals or standard errors with the error bars? – Larissa Cury Nov 30 '22 at 17:58
1

No, `mean_se()` returns the mean (`y`) plus and minus a multiple (default 1) of the standard error (`ymin`/`ymax`) – Captain Hat Dec 02 '22 at 09:39
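To make the difference from the sample minimum and maximum concrete, here is a small base R sketch of the same calculation done by hand. The vector `x` is a made-up toy sample, and `by_hand` is just an illustrative name.

```r
# Toy sample (hypothetical values, for illustration only)
x <- c(6.3, 8.6, 8.3, 6.8, 8.5)

n  <- length(x)
m  <- mean(x)
se <- sd(x) / sqrt(n)  # standard error of the mean

# Same three quantities that mean_se() reports: mean, mean - SE, mean + SE
by_hand <- c(y = m, ymin = m - se, ymax = m + se)
by_hand

# The sample extremes are a different, wider range here
range(x)
```

Because the standard error shrinks with the square root of the group size, `ymin` and `ymax` sit close to the mean for large groups, whatever the spread of the raw values.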
1

So currently the error bars on the bar plot show a probable range of the population mean based on the values in your sample – Captain Hat Dec 02 '22 at 09:41
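If you want something closer to a confidence interval, one option is to widen the bars through the `mult` argument of `mean_se()`. The sketch below uses 1.96, which gives roughly a 95% interval under a normal approximation. For small groups a t quantile would be wider, so treat this as an assumption rather than a recommendation. `toy` and `ci` are illustrative names with made-up values.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for myData (hypothetical values, same column names as the question)
toy <- tibble(
  varUnlist = c(6.3, 8.6, 8.3, 6.8, 5.4, 8.1, 9.1, 5.7, 7.6, 8.2, 8.7, 7.0),
  type      = factor(rep(c("A", "B", "C"), each = 4))
)

# mean +/- 1.96 standard errors per type (approximate 95% interval)
ci <- toy |>
  group_by(type) |>
  summarise(mean_se(varUnlist, mult = 1.96))

p <- ggplot(ci, aes(x = type, y = y, fill = type)) +
  geom_col() +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.2) +
  geom_text(aes(y = ymax, label = round(y, 2)), vjust = -0.5)  # label above the error bar
```

Anchoring the label at `ymax` instead of nudging by a fixed amount keeps it clear of the error bar even when the intervals differ in width between types.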
