EDIT: added a small example data frame.
I have a small dataset called benthic_data containing some benthic invertebrate indices (only one metric, Abundance, is included below).
Site <- c('S-7','S-7','S-7','S-7','S-7','S-27','S-27','S-27','S-27','S-27')
Sample <- c('S-7-1','S-7-2','S-7-3','S-7-4','S-7-5','S-27-1','S-27-2','S-27-3','S-27-4','S-27-5')
Abundance <- c(310, 316, 361, 317, 321, 108, 173, 189, 229, 210)
benthic_data <- data.frame(Site, Sample, Abundance)
I have converted Sample to a factor and would like to generate a figure that has one point for each sample, followed by a mean (with standard deviation whiskers) for each site.
benthic_data$Sample <- factor(benthic_data$Sample, levels = c('S-7-1', 'S-7-2', 'S-7-3', 'S-7-4', 'S-7-5', 'S-27-1', 'S-27-2', 'S-27-3', 'S-27-4', 'S-27-5'))
A basic plot of the samples and their respective abundance values works fine (I will make the figure prettier later):
library(ggplot2)

ggplot(benthic_data, aes(x = Sample, y = Abundance, fill = Site)) +
  geom_point(size = 4, colour = "black", shape = 21, show.legend = FALSE) +
  scale_fill_manual(values = c("darkgreen", "orangered3"))
To calculate the mean and SD for each site, I used the following code. I also want the mean/SD point for each site to be labelled S-7 Mean and S-27 Mean, respectively, so I recode Site and then convert it to a factor.
library(dplyr)

benthic_summary <- benthic_data %>%
  group_by(Site) %>%
  summarize(mean = mean(Abundance, na.rm = TRUE),
            sd = sd(Abundance, na.rm = TRUE))
# revalue() is from plyr; calling it as plyr::revalue() avoids attaching plyr, which would mask dplyr::summarize()
benthic_summary$Site <- plyr::revalue(benthic_summary$Site, c("S-7" = "S-7 Mean", "S-27" = "S-27 Mean"))
benthic_summary$Site <- factor(benthic_summary$Site, levels = c("S-7 Mean", "S-27 Mean"))
Now, to combine the 5 points for each site PLUS the mean/SD for each site, I used geom_pointrange. I had to add two more colours to scale_fill_manual because with only two I got this error message: Error: Insufficient values in manual scale. 4 needed but only 2 provided.
The code below runs, EXCEPT for two problems: I need the S-7 samples to come first (it is an upstream site), followed by the S-27 samples, and the legend does not reflect the proper site colours.
Site S-7 should be green and site S-27 should be orangered.
AEMP_cols <- c("darkgreen", "orangered3")
ggplot(benthic_data, aes(x = Sample, y = Abundance, fill = Site)) +
  geom_point(size = 4, colour = "black", shape = 21, show.legend = FALSE) +
  scale_fill_manual(values = c("darkgreen", "darkgreen", "orangered3", "orangered3")) +
  geom_pointrange(data = benthic_summary, aes(x = Site, y = mean, ymin = mean - sd, ymax = mean + sd), colour = AEMP_cols, size = 1, shape = 15)
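As far as I can tell, the "4 needed" in the error comes from fill being mapped to Site in both layers: the raw data contributes S-7 and S-27, and the relabelled summary contributes S-7 Mean and S-27 Mean. A minimal check of that count, using hypothetical vector names of my own and the values copied from the data above:

```r
# Site values in the raw data and in the relabelled summary (copied from above)
raw_sites <- c("S-7", "S-27")
summary_sites <- c("S-7 Mean", "S-27 Mean")

# Both layers map fill to Site, so the fill scale has to cover every distinct value
fill_values <- union(raw_sites, summary_sites)
length(fill_values)  # 4
```

This only explains the count. It does not tell me which colour ends up on which of the four values.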
So, I would like help figuring out how to ensure that the order of samples (points) on the x-axis is: S-7-1, S-7-2 .... S-7-5, S-7 Mean, then S-27-1, S-27-2 .... S-27-5, S-27 Mean. That is, similar to what the code above creates, but with the S-7 samples first, followed by the S-27 samples.
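To make the target order concrete, here is a rough sketch in base R that builds the x-axis labels in the order I am after. The names `sites` and `x_levels` are placeholders of my own, and the sketch assumes five samples per site, as in the data above:

```r
# Sites in the desired plotting order (upstream site first)
sites <- c("S-7", "S-27")

# For each site: its five sample labels, followed by its "Mean" label
x_levels <- unlist(lapply(sites, function(s) {
  c(paste0(s, "-", 1:5), paste(s, "Mean"))
}))

x_levels
```

I assume a vector like this could be passed to something like scale_x_discrete(limits = x_levels), but I have not confirmed that this gives the right result together with the fill colours and legend.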
I can easily recreate the code for the other indices so I am just starting with Abundance for now.
Any help would be appreciated. Thanks.