With the code below,

library(ggplot2)
library(stringr)  # needed for str_wrap()

# Provides the data frame `df` used below
load(url("http://murraylax.org/datasets/cps2016.RData"))

ggplot(df, aes(industry, usualhrs, fill = as.factor(sex))) +
  stat_summary(geom = "bar", fun = mean, position = "dodge", width = 0.7) +
  stat_summary(geom = "errorbar", fun.data = mean_se, position = "dodge", width = 0.7) +
  stat_summary(aes(label = round(after_stat(y), 0)), fun = mean, geom = "text", size = 3, vjust = -1) +
  xlab("Industry") + ylab("Usual Hourly Earnings") +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 12)) +
  labs(fill = "Gender") +
  theme_bw() +
  theme(legend.position = "bottom")  # placed after theme_bw(), which would otherwise reset it

I am producing this bar plot (with error bars):

[Image: output of the code above]

The labels are centered on each x-axis category, but I would like them centered over each bar. In the first pair of bars, for example, I would like 27 centered over the "Female" bar and 46 centered over the "Male" bar. I would also like to move the labels above the tops of the error bars.

Thiago

1 Answer

Add position = position_dodge(width = 1) to your stat_summary(aes(label = ...)) call, outside of aes(), to center the labels over their respective bars.
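A minimal sketch of that change, assuming a small made-up data frame (`toy`, with invented values) in place of the CPS data so that it runs without a download. The sketch uses a dodge width of 0.7 rather than 1, following the suggestion in the comments below to match the `width = 0.7` of the bars; sharing one `dodge` object across all layers is my own choice, not part of the original code.

```r
library(ggplot2)

# Hypothetical toy data standing in for the CPS data set (values are invented)
toy <- data.frame(
  industry = rep(c("A", "B"), each = 6),
  sex      = rep(c("Female", "Male"), times = 6),
  usualhrs = c(30, 40, 32, 44, 34, 42, 28, 38, 26, 36, 30, 40)
)

# One dodge specification shared by every layer, so that bars,
# error bars and labels are shifted by the same amount
dodge <- position_dodge(width = 0.7)

p <- ggplot(toy, aes(industry, usualhrs, fill = sex)) +
  stat_summary(geom = "bar", fun = mean, position = dodge, width = 0.7) +
  stat_summary(geom = "errorbar", fun.data = mean_se, position = dodge, width = 0.3) +
  # position goes outside aes(); the label itself is the rounded group mean
  stat_summary(aes(label = round(after_stat(y), 0)), fun = mean, geom = "text",
               size = 3, vjust = -1, position = dodge)

p
```

The text layer has no width of its own, so it needs an explicit `position_dodge(width = ...)`; `position = "dodge"` alone would not tell it how far to shift.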

To move the labels above the error bars, I used geom_text with a y position slightly above the top of each error bar, which required calculating the error bar positions ahead of time with dplyr::summarise:

library(dplyr)
library(ggplot2)
library(stringr)  # for str_wrap()

df %>%
  group_by(industry, sex) %>%
  # Mean and standard error of usualhrs for each industry/sex group
  summarise(usualhrs_mean = mean(usualhrs, na.rm = TRUE),
            count = n(),
            usualhrs_se = sd(usualhrs, na.rm = TRUE) / sqrt(count)) %>%
  ggplot(aes(x = industry, y = usualhrs_mean, fill = as.factor(sex))) +
  geom_bar(stat = "identity", position = position_dodge(width = 1)) +
  geom_errorbar(aes(ymin = usualhrs_mean - usualhrs_se,
                    ymax = usualhrs_mean + usualhrs_se),
                position = position_dodge(width = 1)) +
  # Anchor each label just above the top of its error bar
  geom_text(aes(label = round(usualhrs_mean, 0),
                y = usualhrs_mean + usualhrs_se + 0.1),
            vjust = -1.5, position = position_dodge(width = 1)) +
  scale_x_discrete(labels = function(x) str_wrap(x, width = 12)) +
  coord_cartesian(ylim = c(0, 55)) +
  labs(fill = "Gender",
       y = "Usual Hourly Earnings") +
  theme_bw() +
  theme(legend.position = "bottom")  # placed after theme_bw(), which would otherwise reset it


Greg
  • I got the following error with your code, @Greg: Error: n() should only be called in a data context – Thiago May 22 '20 at 14:55
  • @Thiago - I suspect this is a package conflict with something else you have loaded. Try specifying `dplyr::summarize`, `dplyr::n` etc. – Greg May 22 '20 at 14:59
  • How can I keep the spacing between bars the same as in the original plot using your code? I mean: the two bars within each x-axis category close together, and more distance between categories. – Thiago May 22 '20 at 15:15
  • @Thiago change `width` in `position = position_dodge()` to 0.7. – Greg May 22 '20 at 15:40
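The two suggestions from the comments can be sketched together as follows. This is a hedged illustration that assumes the `n()` error comes from another loaded package masking dplyr's functions (a guess, as in the comment above), and it uses a made-up data frame (`toy`, invented values) instead of the CPS data. The `dplyr::` prefixes and the 0.7 dodge width are the only changes relative to the answer's approach.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the CPS data set (values are invented)
toy <- data.frame(
  industry = rep(c("A", "B"), each = 6),
  sex      = rep(c("Female", "Male"), times = 6),
  usualhrs = c(30, 40, 32, 44, 34, 42, 28, 38, 26, 36, 30, 40)
)

# Explicit namespaces so that a masking package cannot hijack these calls
toy_summary <- toy %>%
  dplyr::group_by(industry, sex) %>%
  dplyr::summarise(
    usualhrs_mean = mean(usualhrs, na.rm = TRUE),
    count         = dplyr::n(),
    usualhrs_se   = sd(usualhrs, na.rm = TRUE) / sqrt(count)
  )

# Dodge width equal to the bar width keeps each pair of bars touching,
# with a gap between categories
dodge <- position_dodge(width = 0.7)

p <- ggplot(toy_summary, aes(industry, usualhrs_mean, fill = sex)) +
  geom_col(position = dodge, width = 0.7) +
  geom_errorbar(aes(ymin = usualhrs_mean - usualhrs_se,
                    ymax = usualhrs_mean + usualhrs_se),
                position = dodge, width = 0.3) +
  # Labels sit just above the top of each error bar
  geom_text(aes(label = round(usualhrs_mean, 0),
                y = usualhrs_mean + usualhrs_se),
            vjust = -0.5, position = dodge)

p
```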