
I'm currently trying to plot mean values of a variable pt for each combination of species/treatments in my experiments. This is the code I'm using:

ggplot(data = data, aes(x=treat, y=pt, fill=species)) +
 geom_bar(position = "dodge", stat="identity") +
 labs(x = "Treatment", 
      y = "Proportion of Beetles on Treated Side", 
      colour = "Species") +
 theme(legend.position = "right")

[Image: R output plot]

As you can see, the plot seems to assume the means of my 5N and 95E treatments are 1.00, which isn't correct. I have no idea where the problem could be.

M--
  • Can you provide some data? – akash87 Jan 21 '20 at 18:04
  • Welcome to Stack Overflow! You should provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – M-- Jan 21 '20 at 18:09
  • @akash87 [link](https://1drv.ms/u/s!AjSCWBxDTEdkyy7Vv0YsCpFbge6q?e=OAKHk1) is the dataset – Sam Marchetti Jan 21 '20 at 18:19
  • The link in @M--'s comment has suggestions for how to include a sample of data in the post rather than at a third party site – camille Jan 21 '20 at 18:22

2 Answers


I took a stab at what you are asking using the tidyverse, which includes both dplyr and ggplot2.

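library(tidyverse)

dat %>% 
  group_by(treat, species) %>% 
  # one row per treatment/species combination, holding the mean of pt
  summarise(mean_pt = mean(pt)) %>% 
  ungroup() %>% 
  ggplot(aes(x = treat, y = mean_pt, fill = species, group = species)) + 
  geom_bar(position = "dodge", stat = "identity") +
  labs(x = "Treatment", 
       y = "Proportion of Beetles on Treated Side", 
       fill = "Species") +  # the legend comes from the fill aesthetic, not colour
  theme(legend.position = "right") +
  # label each bar with its mean; width = 0.9 matches the bars' default dodge width
  geom_text(aes(label = round(mean_pt, 3)), size = 3, hjust = 0.5, vjust = 3,
            position = position_dodge(width = 0.9))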

Here dat is the actual dataset, and I calculated mean_pt since that is what you are trying to plot. I also added a geom_text layer so you can see the resulting values and compare them with what you expected.
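As an alternative sketch, ggplot2 can compute the group means itself through stat_summary(), which avoids the separate summarise step. The toy data frame below is hypothetical: only the column names (treat, species, pt) follow the question, and the values are invented for illustration. The `fun = mean` argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical toy data: invented values, column names as in the question
toy <- data.frame(
  treat   = rep(c("5N", "95E"), each = 4),
  species = rep(c("A", "B"), times = 4),
  pt      = c(0.2, 0.4, 1.0, 0.6, 0.5, 1.0, 0.7, 0.8)
)

p <- ggplot(toy, aes(x = treat, y = pt, fill = species)) +
  # stat_summary() collapses each treat/species group to its mean before drawing
  stat_summary(fun = mean, geom = "bar", position = "dodge") +
  labs(x = "Treatment",
       y = "Proportion of Beetles on Treated Side",
       fill = "Species")

# Inspect the values ggplot2 will draw: one bar per treat/species combination
bar_data <- layer_data(p)
```

Printing p draws the plot. Whether to summarise beforehand or inside the plot is mostly a matter of taste; summarising first makes the plotted numbers easier to inspect and reuse.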

akash87

From my understanding, this won't plot the means of your y variable by default. Have you calculated the means for each treatment? If not, I'd recommend adding a column to your dataframe that contains the mean. I'm sure there's an easier way to do this, but try:

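data$means <- rep(NA, nrow(data))
for (x in seq_len(nrow(data))) {
    # rows that share this row's treatment and species
    same_group <- data$treat == data$treat[x] & data$species == data$species[x]
    # mean of pt within that treatment/species combination
    data$means[x] <- mean(data$pt[same_group])
}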

Then map y to the new means column and try replacing

geom_bar(position = "dodge", stat="identity")

with

geom_col(position = "dodge")

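If your y variable already contains means, geom_col(position = "dodge") will plot them directly; it is shorthand for geom_bar(stat = "identity"). Neither one computes a mean. With several rows per bar, position = "dodge" draws the bars on top of each other, so only the largest value is visible, while the default stacked position would sum them.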
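A minimal base R sketch of the same idea without an explicit loop, using aggregate() and ave(). The toy data frame is hypothetical (invented values; only the column names treat, species and pt follow the question).

```r
# Hypothetical toy data: invented values, column names as in the question
toy <- data.frame(
  treat   = rep(c("5N", "95E"), each = 4),
  species = rep(c("A", "B"), times = 4),
  pt      = c(0.2, 0.4, 1.0, 0.6, 0.5, 1.0, 0.7, 0.8)
)

# One row per treat/species combination, holding the mean of pt
means <- aggregate(pt ~ treat + species, data = toy, FUN = mean)

# What overlapping dodged bars end up showing: the tallest value in each group
tallest <- aggregate(pt ~ treat + species, data = toy, FUN = max)

# ave() returns the group mean for every row, with no explicit loop
toy$means <- ave(toy$pt, toy$treat, toy$species)
```

The means data frame can then be passed to ggplot with y = pt and geom_col(position = "dodge"), since it holds exactly one row per bar. Comparing means with tallest shows why unsummarised data can appear to have a mean of 1.00.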

DanStu