Data1Chamber
A tibble: 4 x 4
  genotype NovelMouseChamber CChamber NovelObjectChamber
  <chr>    <chr>                <chr>       <chr>                
1 EXP      457.457              54.4878     87.0871              
2 ctrl     129.596              146.413     323.023              
3 ctrl     306.306              73.7404     218.952              
4 ctrl     369.603              117.518     111.912 

I want a bar graph with error bars:
The x axis has two groups: mean(EXP) vs. mean(ctrl).
The y axis shows the values of the three columns (NovelMouseChamber, CChamber, NovelObjectChamber).

I attempted to adapt this example, unsuccessfully, using the following edits:

dfm<- melt(Data1Chamber[, c("genotype", "NovelMouseChamberCum","CChamberCum", "NovelObjectChamberCum")], id.vars= 1)

plotted using:

ggplot(dfm,aes(x = genotype, y = value)) + 
  geom_bar(aes(fill = variable),stat = "identity",position = "dodge")


[Image: expected graph]

I have not collected all the data yet, so I will add many more EXP and ctrl data points and then try to get error bars. For now I was just trying to see whether I could generate the graph with the data I have so far.

JConner16
  • Can you add the code you tried that has failed to work so far? My guess is your data may need some reshaping to get into an easy-to-use-with-ggplot2 format; see [here](https://stackoverflow.com/questions/10212106/creating-grouped-bar-plot-of-multi-column-data-in-r) for an example. – aosmith Oct 02 '18 at 18:30
  • Thank you for looking at this! Honestly, I'm pretty lost and do not really know where to start. I'll try to look at the example you posted. – JConner16 Oct 02 '18 at 19:31
  • Looking at your sample data, I see that the first row is experiment (EXP) and the remaining three rows are control (ctrl). What do you want done with the three control rows? Do you want them plotted individually? If so, how do you want to distinguish between them? If not, how do you want to combine them - with `mean()`? – Gregor Thomas Oct 02 '18 at 20:22
  • I think you're right and I need to format my data better. I think the variables I should be using are genotype, location, and time. I'm trying to clean my data with tidyr. I'll let you know if I make headway. – JConner16 Oct 02 '18 at 20:22
  • I'm not sure I phrased the question correctly. I'll add a picture of what I want my graph to look like. Thank you for looking at this! – JConner16 Oct 02 '18 at 20:26
  • This is a great start. One issue I see is that your numeric variables are characters. While you likely want to figure out why R thinks these should be characters when they should be numeric :), one way to get things worked out after the fact is to use `type.convert()`. That would look like `Data1Chamber = type.convert(Data1Chamber)`. Then you could `melt()` (and maybe summarize things?) prior to plotting. – aosmith Oct 02 '18 at 20:26

1 Answer


This should get you started, and provide a good framework.

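library(ggplot2)

# The header has one fewer field than the data rows, so read.table()
# treats the leading 1-4 as row names
df = read.table(text = "genotype NovelMouseChamber CChamber NovelObjectChamber
1 EXP      457.457              54.4878     87.0871
2 ctrl     129.596              146.413     323.023
3 ctrl     306.306              73.7404     218.952
4 ctrl     369.603              117.518     111.912", header = TRUE)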

dfm = reshape2::melt(df, id.vars = "genotype")

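# Note: fun, fun.min and fun.max replace fun.y, fun.ymin and fun.ymax as of ggplot2 3.3.0
ggplot(dfm, aes(x = genotype, fill = variable, color = variable, y = value)) +
  # Bar height: mean of each genotype/variable group
  stat_summary(geom = "bar", fun = mean, position = "dodge") +
  # Error bars: mean +/- 1.96 * sd; a group with a single observation
  # has sd = NA, so no error bar is drawn for it
  stat_summary(geom = "errorbar",
               fun.min = function(y) mean(y) - 1.96 * sd(y),
               fun = mean,
               fun.max = function(y) mean(y) + 1.96 * sd(y),
               position = position_dodge(width = 0.9),
               width = 0.3, color = "black")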


Above, I relied on ggplot to do the data manipulation, that is, calculating the means and the intervals for the error bars. I generally recommend against that: I prefer to do those calculations with dplyr or data.table first, which makes the plotting code more straightforward. If you create a data frame with columns y, ymin and ymax (in addition to your genotype, variable and value columns - whatever those are), then you can just use geom_bar (with stat = "identity") and geom_errorbar without all the stat_summary complications.
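
As a rough sketch of that pre-summarised approach, assuming dplyr (1.0.0 or later), reshape2 and ggplot2 are installed and reusing the sample data from the question: the object names df_long and df_summary are placeholders chosen for illustration, and the interval is the same mean +/- 1.96 * sd as in the plot above, not a recommendation for a particular error measure.

```r
library(dplyr)
library(ggplot2)

df <- read.table(text = "genotype NovelMouseChamber CChamber NovelObjectChamber
1 EXP      457.457              54.4878     87.0871
2 ctrl     129.596              146.413     323.023
3 ctrl     306.306              73.7404     218.952
4 ctrl     369.603              117.518     111.912", header = TRUE)

# Long format: one row per genotype/chamber measurement
df_long <- reshape2::melt(df, id.vars = "genotype")

# Do the calculations up front: one row per genotype/variable group.
# sd() of a single value is NA, so EXP gets NA limits until more data is added.
df_summary <- df_long %>%
  group_by(genotype, variable) %>%
  summarise(
    y    = mean(value),
    ymin = mean(value) - 1.96 * sd(value),
    ymax = mean(value) + 1.96 * sd(value),
    .groups = "drop"
  )

# The plotting code now only maps precomputed columns
p <- ggplot(df_summary, aes(x = genotype, y = y, fill = variable)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax),
                position = position_dodge(width = 0.9),
                width = 0.3)
print(p)
```

Keeping the summary in its own data frame also makes it easy to inspect the numbers behind each bar, or to swap in a different interval (for example one based on the standard error) without touching the plot.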

As mentioned in comments, you'll also need to make sure your data is all of appropriate types - numeric data should be numeric, not character, before you do any plotting or calculations.
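
One way to check and fix the column types, sketched on a small stand-in for the tibble in the question (the object names chambers_chr and chambers_num are made up for illustration, and only two of the measurement columns are included):

```r
# Stand-in for the question's data, with the numbers stored as character
chambers_chr <- data.frame(
  genotype          = c("EXP", "ctrl"),
  NovelMouseChamber = c("457.457", "129.596"),
  CChamber          = c("54.4878", "146.413"),
  stringsAsFactors  = FALSE
)

str(chambers_chr)  # the measurement columns are listed as chr

# Convert every column except genotype to numeric
measure_cols <- setdiff(names(chambers_chr), "genotype")
chambers_num <- chambers_chr
chambers_num[measure_cols] <- lapply(chambers_num[measure_cols], as.numeric)

str(chambers_num)  # the measurement columns are now num
```

The `type.convert()` call suggested in the comments is a shorter route to the same result. If `as.numeric()` produces NA values with a warning, some entries in that column are not valid numbers, which is usually the reason the column was read as character in the first place.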

Gregor Thomas