I would like to draw a bar plot with error bars for this data.

I managed to get the bars with:

ggplot(FCDreach_global_mod, aes(x = as.factor(t3-t2), y = 1-value, fill=as.factor(t2-t1) )) + 
  geom_bar(stat = "identity" )

However, I don't know how to draw the error bars. I tried geom_errorbar() but couldn't get it to work.


When drawing line plots I would use:

stat_summary(fun.data=mean_cl_normal, geom="errorbar")  

but this does not seem to work correctly with geom_bar()

I tried this:

ggplot(FCDreach_global_mod, aes(x = as.factor(t3-t2), y = 1-value, fill=as.factor(t2-t1) ) ) + 
  stat_summary(fun.y=mean,geom="bar")+
  stat_summary(fun.data=mean_cl_normal,geom="errorbar", width=0.5) 

and the breaks on the y axis looked quite different from the ones I got with geom_bar(stat = "identity"). The bars are the same size, but something odd happens with the y scale.

geom_bar:

[figure: bar plot produced with geom_bar]

stat_summary:

[figure: bar plot produced with stat_summary]

EDIT: the desired output is a bar plot equivalent of the following line plot, dropping its x axis and placing t3-t2 on the x axis instead.

[figure: line plot with error bars]

which I obtain by:

ggplot(FCDreach_global_mod, aes(x=roundedRealNumVehicles/2, y=1-value, colour=as.factor(t3-t2), lty=as.factor(t2-t1)) )  +
   stat_summary( fun.y=mean, geom="line" ) + 
   stat_summary(fun.data=mean_cl_normal,geom="errorbar", width=0.5) 
cross
  • so, the bars are representing the sum of (1-value) aggregated by (t2-t1), is that what you are trying to show on the y-axis? What should the errorbars represent in that case? – Rorschach Aug 08 '15 at 15:41
  • Can you be more specific about your desired output? As I understand you want the first plot you show code for but with errorbars? What are the errorbars? Standard deviations? – mts Aug 08 '15 at 15:42
  • I want to have the confidence interval for the 95 percentile with the errorbars @nongkrong – cross Aug 08 '15 at 15:52
  • @nongkrong sorry if I was not clear. I want only two bars as it is shown in the figures above. these two bars should show the mean of the 1-value from the lineplot but to not consider the `roundedRealNumVehicles/2`. Then for the resulting bars, I want to have confidence intervals. Hope I clarified it now, sorry... – cross Aug 08 '15 at 17:42
  • what is wrong with the `stat_summary` graphic? it seems to do just what you want. – Rorschach Aug 08 '15 at 17:59
  • is it that you want something like `ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position="dodge")` ? – mts Aug 08 '15 at 18:35
  • @nongkrong I don't get where this `0.075` comes from when using `stat_summary`. I think it is just too small to make sense in my data. How can I show percentages on the `y` axis? Maybe then it will become clearer to me... – cross Aug 08 '15 at 18:46
  • @mts not sure if I understood what you mean... is there any link which shows what your command does? – cross Aug 08 '15 at 18:47
  • @cross it is still unclear what your desired output is, at least to me. What is wrong with the second plot you show (as nongkrong is asking)? The last plot rather confuses me when I try to work out your desired result. It also confuses me that `t2-t1` and `t3-t2` correspond to the same selection, so why do you highlight both? (i.e. it would be enough to categorize by `t2-t1` alone, since that value determines the value of `t3-t2`). – mts Aug 08 '15 at 18:53
  • @mts you are probably right with all the questions you have raised, because the data I have shown here is a subset of a bigger data frame, so some things might appear odd. Let's narrow my question down to a couple of basic things. 1) Why are the outputs of `geom_bar` and `stat_summary` different, aren't they doing the same task? Doesn't `geom_bar` give the mean of the data too? 2) How do I add error bars for confidence intervals to `geom_bar`? 3) How do I show percentages on the `y` axis? – cross Aug 08 '15 at 19:48
  • @cross maybe you could provide more representative sample data? Anyway, here are my 2 cents on the questions you just raised: 1) your first `geom_bar` plot somehow sums over the two groups, which explains the numbers on the y-axis. 2) not as nice, but calculate the aes outside of ggplot and `+geom_errorbar` should work 3) I don't understand, percentages of what? Finally, take a look at this question and answer, which could be related: http://stackoverflow.com/questions/11604070/issue-with-ggplot2-geom-bar-and-position-dodge-stacked-has-correct-y-values – mts Aug 08 '15 at 20:42
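
Point 2 in the last comment (computing the error-bar limits outside ggplot) could be sketched roughly as follows, using base R only. This is a hedged illustration rather than anyone's posted code: `toy` is a made-up stand-in for FCDreach_global_mod, `ci_mean` and `summ` are hypothetical names, and the interval is a t-based 95% confidence interval for the mean, which is an assumption about what the error bars should show.

```r
# Hypothetical stand-in for FCDreach_global_mod; column names follow the question.
set.seed(42)
toy <- data.frame(
  t2    = rep(c(0.5, 0.6), each = 40),
  t3    = 1.0,
  value = c(rnorm(40, 0.91, 0.02), rnorm(40, 0.90, 0.02))
)

# Mean plus t-based confidence limits of a numeric vector (hypothetical helper).
ci_mean <- function(x, level = 0.95) {
  n <- length(x)
  half <- qt((1 + level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(y = mean(x), ymin = mean(x) - half, ymax = mean(x) + half)
}

# One row per level of t3 - t2: bar height and error-bar limits.
groups <- split(1 - toy$value, factor(toy$t3 - toy$t2))
summ <- data.frame(x = names(groups), do.call(rbind, lapply(groups, ci_mean)))
print(summ)
```

A data frame shaped like `summ` can then be handed to ggplot with `geom_bar(stat = "identity")` for the bars and `geom_errorbar(aes(ymin = ymin, ymax = ymax))` for the limits.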

1 Answer

In your first graph, the y-axis shows (1-value) summed within each level of (t3-t2). In the second, the y-axis shows the mean. You can check this by recreating both sets of values with aggregate:

## Question 1: what is the y-axis of the first plot?
## Aggregate by summing (1-value)
(p1 <- aggregate((1-value) ~ I(t3-t2), data=FCDreach_global_mod, sum))
#   I(t3 - t2) (1 - value)
# 1        0.4    19.51663
# 2        0.5    19.70297

## Question 2: where does the 0.075 come from in the stat_summary?
## Aggregate (1-value) taking the mean
(p2 <- aggregate((1-value) ~ I(t3-t2), data=FCDreach_global_mod, mean))
#   I(t3 - t2) (1 - value)
# 1        0.4  0.09119921
# 2        0.5  0.09038062

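## Get normal-approximation 95% confidence intervals (mean +/- qnorm(0.975) * standard error)
## Note: `se` holds the lower and upper interval limits, not the standard error itself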
se <- with(FCDreach_global_mod,
           do.call(rbind,
                   lapply(split(1 - value, factor(t3-t2)), function(x)
                       mean(x) + c(-1,1)*sd(x)/sqrt(length(x))*qnorm(0.975))
                   ))


## Recreate barplot
dat <- setNames(p2, c("x", "y"))
dat <- cbind(dat, setNames(data.frame(se), c("ymin", "ymax")))

ggplot(dat, aes(x,y)) +
  geom_bar(stat="identity", aes(fill=factor(x))) +
  geom_errorbar(aes(x=x, ymin=ymin, ymax=ymax), color="black", width=0.05) +
  theme_bw()

[figure: bar plot with error bars]
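
As a complement, roughly the same picture can be drawn directly with stat_summary, which also makes it easy to label the y axis in percent (the third question in the comments). The sketch below is only an illustration under stated assumptions: `toy` is a made-up stand-in for FCDreach_global_mod, `fun` is the ggplot2 3.3.0+ replacement for `fun.y`, and mean_cl_normal needs the Hmisc package to be installed. Note also that mean_cl_normal uses a t quantile rather than qnorm, so its error bars come out slightly wider than the ones computed by hand above.

```r
library(ggplot2)  # mean_cl_normal additionally requires the Hmisc package

# Hypothetical stand-in for FCDreach_global_mod; column names follow the question.
set.seed(1)
toy <- data.frame(
  t1    = 0.1,
  t2    = rep(c(0.5, 0.6), each = 50),
  t3    = 1.0,
  value = runif(100, 0.85, 0.95)
)

p <- ggplot(toy, aes(x = factor(t3 - t2), y = 1 - value, fill = factor(t2 - t1))) +
  stat_summary(fun = mean, geom = "bar") +                          # bar height = group mean
  stat_summary(fun.data = mean_cl_normal, geom = "errorbar", width = 0.2) +  # 95% CI of the mean
  scale_y_continuous(labels = scales::percent) +                    # show 0.09 as 9%
  labs(x = "t3 - t2", y = "mean of 1 - value", fill = "t2 - t1")

print(p)
```

Because both layers summarise the raw rows, the bar heights are means rather than the stacked sums that geom_bar(stat = "identity") produces.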

Rorschach