My colleague and I are trying to create a stacked bar graph that is grouped first by RIL (on the x-axis) and then by Trt, with the Trt (treatment) bars clustered together and distinguished by colour. We also wish to label each bar within a cluster with its treatment (Trt).

The stacked bars represent the calculated means of SW_Before and SW_After (note that in the sample data one RIL, number 206, has more than one row of data).

I originally thought to combine the two columns SW_Before and SW_After. However, the Control rows of Trt contain no data for SW_Before and SW_After, yet they must still be included in the graph. Thus, a third column, SW_Total, is graphed for each of the control clusters by RIL.

I am relatively new to R as well as the realm of data organization so please excuse my amateur capabilities.

Below is a reproducible sample of my data:

data1 <- read.table (text= "Plant RIL Trt SW_Before SW_After SW_Total 
1 85 206 Early 0.380 2.27 2.65
2 88 166 Early 0 0.311 0.311
3 92 Lindo Early 0 0.663 0.633
4 94 158 Early 0.0738 0.596 0.669 
5 95 23 Early 0.0252 0.543 0.795
6 97 Lica Early 0 0.646 0.646
7 104 166 Peak 0.227 0.261 0.488
8 108 Lica Peak 0.0705 0.816 0.887
9 113 Lindo Late 0.628 0 0.628
10 115 206 Late 0.544 1.05 1.60
11 115 206 Control NA NA 1.50", sep="", header=T)    

I realize this graph is more difficult to create than I imagined, so any assistance or direction would be most appreciated.

EDIT:

I am now trying to graph the mean of each variable (SW_Total, SW_Before and SW_After) by RIL and Trt. This is my code:

melted1 <- melt(data.baSW, id=c("Plant", "RIL", "Trt"))
melted1 <- subset(melted1, RIL %in% c("158", "166", "206", "23", "Licalla", "Lindo"))


melted1 %>%
  group_by(Trt, RIL, variable) %>%
  mutate(mean.SW_Total = mean(value)) %>%
  ggplot(aes(x = RIL, y = mean.SW_Total, fill = variable)) + 
  geom_bar(stat = 'identity', position = 'stack') + facet_grid(~ Trt)

EDIT 2

I have updated my code in response to my EDIT #1. I believe this is the correct code, but verification would be nice.

melted1 %>%
  ggplot(aes(x = RIL, y = value, fill = variable)) + 
  geom_bar(stat = 'summary', position = 'stack', fun.y = "mean") + facet_grid(~ Trt)
Cameron So

1 Answer

I am not 100% sure I have interpreted your question correctly but I think this is close to what you want, adjusted from here.

library(reshape2) # for melt
library(tidyverse)

# convert all total values to 0 except that for the control ...
data1 <- data1 %>%
  mutate(SW_Total = ifelse(Trt != "Control", 0, SW_Total))

#convert to long format
melted <- melt(data1, id=c("Plant","RIL","Trt"))

ggplot(melted, aes(x = RIL, y = value, fill = variable)) + 
  geom_bar(stat = 'identity', position = 'stack') + facet_grid(~ Trt) 

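
If the bars should show means per RIL, Trt and variable (as the question's edits suggest), one option is to summarise before plotting rather than stacking the raw rows. The following is only a sketch under some assumptions: it uses a subset of the sample rows from the question, `tidyr::pivot_longer()` in place of `melt()`, and the names `long`, `means`, `mean_value` and `p` are made up for illustration. Rows with `NA` are dropped first so that the Control groups do not produce `NaN` means.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# A subset of the sample rows from the question (row names omitted)
data1 <- read.table(text = "Plant RIL Trt SW_Before SW_After SW_Total
85 206 Early 0.380 2.27 2.65
88 166 Early 0 0.311 0.311
104 166 Peak 0.227 0.261 0.488
113 Lindo Late 0.628 0 0.628
115 206 Late 0.544 1.05 1.60
115 206 Control NA NA 1.50", header = TRUE)

# Keep SW_Total only for the Control rows, as in the answer above
data1 <- data1 %>%
  mutate(SW_Total = ifelse(Trt != "Control", 0, SW_Total))

# Long format: one row per Plant / RIL / Trt / variable
long <- data1 %>%
  pivot_longer(cols = c(SW_Before, SW_After, SW_Total),
               names_to = "variable", values_to = "value")

# One mean per RIL / Trt / variable, ignoring missing values
means <- long %>%
  filter(!is.na(value)) %>%
  group_by(RIL, Trt, variable) %>%
  summarise(mean_value = mean(value), .groups = "drop")

# Stacked bars of the means, one facet per treatment
p <- ggplot(means, aes(x = RIL, y = mean_value, fill = variable)) +
  geom_col(position = "stack") +
  facet_grid(~ Trt)
```

Calling `print(p)` draws the plot. Using `summarise()` rather than `mutate()` matters here: `mutate()` keeps every original row, so stacking with `stat = 'identity'` would add the same mean once per row in the group.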

flee
  • This is exactly what I wanted. Except, the data from the Control cluster would come from data column `SW_Total`. How would I add this data for the selected RILs shown in the x-axis? Some data values from the control groups are from RILs that I do not want to include (e.g RIL = 199). Also - what is the format to dump multiple unused variables? – Cameron So Mar 23 '18 at 07:34
  • @Cam.S I have updated my answer, is this what you mean? – flee Mar 23 '18 at 07:45
  • see this post for dropping variables: https://stackoverflow.com/questions/4605206/drop-data-frame-columns-by-name – flee Mar 23 '18 at 07:47
  • That is what I mean. Hopefully there is another way to do that, haha. My dataset is much larger and the control rows are scattered throughout. I will just reorder my dataset via Excel. – Cameron So Mar 23 '18 at 08:07
  • @Cam.S I have updated my answer so that `SW_Total` will be replaced with `0` for all rows except `Control` automatically. – flee Mar 23 '18 at 08:19
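
For the two follow-up points in the comments (keeping only selected RILs, and dropping several unused columns at once), a possible approach with dplyr is sketched below. The data frame is a toy example: the row for RIL 199 and the columns `Height` and `Notes` are hypothetical and only stand in for whatever the full dataset contains, and `keep_rils` and `trimmed` are illustrative names.

```r
library(dplyr)

# Toy data: the RIL 199 row and the Height/Notes columns are invented
# for illustration and are not part of the question's sample data
data1 <- data.frame(
  Plant    = c(85, 115, 120),
  RIL      = c("206", "206", "199"),
  Trt      = c("Early", "Control", "Control"),
  SW_Total = c(2.65, 1.50, 1.10),
  Height   = c(30, 28, 25),
  Notes    = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# RILs to show on the x-axis
keep_rils <- c("206", "166", "Lindo")

trimmed <- data1 %>%
  filter(RIL %in% keep_rils) %>%   # drops RIL 199, wherever its rows sit
  select(-c(Height, Notes))        # drops several unused columns in one call
```

Because `filter()` works on values rather than row positions, the control rows do not need to be reordered beforehand.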