Bar plot of means with SD in R

Question

This sounds trivial, but some research didn't turn up an elegant solution: I have a data frame with a categorical variable (GROUP) and a continuous read-out variable (bloodpressure). How can I make a simple bar plot showing the mean of each group with its standard deviation? There are multiple groups: A, B, C, D. How can I perform an ANOVA with post-hoc analysis on the data frame? How would the same work with the Mann-Whitney U test? Can I mark the significance level in the bar plot? How can I extend this to multiple continuous variables (dia_bloodpressure, sys_bloodpressure, mean_bloodpressure) and sink() the output into separate files named after each variable?

this is a bit much for one question. perhaps you should have a look at http://stackoverflow.com/faq#questions and http://stackoverflow.com/q/5963269/1317221 and then streamline your question somewhat — user1317221_G, Sep 17 '12 at 16:33
OK, I guess it's a bit much for one post. But this is the typical workflow of an analysis. So far I have only come across packages that deal with one of the problems each: 1) multiple group testing, 2) very rarely multiple group comparison, 3) bar plots of multiple groups, but never with significance levels. — Doc, Sep 17 '12 at 17:38
can you give a reproducible example?? http://stackoverflow.com/questions/2286085/plotting-of-multiple-comparisons-in-r — Ben Bolker, Jan 05 '13 at 16:48

score 0 · Accepted Answer · answered Sep 17 '12 at 17:40

After some research I found the agricolae package, which provides multiple group comparisons. The resulting objects can be passed to a decent plotting function for groupwise bar graphs +/- SD or SEM. Unfortunately, I found no way to mark significance between groups in the plots.
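
For comparison, the workflow from the question can also be sketched in base R, without agricolae. This is only a minimal sketch under assumptions: the data are simulated, the column names GROUP and bloodpressure are taken from the question, and the significance bracket is drawn by hand for one arbitrarily chosen pair (A vs. D).

```r
# Simulated example data (assumption): four groups, one continuous read-out
set.seed(1)
dat <- data.frame(
  GROUP = factor(rep(c("A", "B", "C", "D"), each = 10)),
  bloodpressure = rnorm(40, mean = rep(c(120, 125, 135, 140), each = 10), sd = 8)
)

# Group means and standard deviations
means <- tapply(dat$bloodpressure, dat$GROUP, mean, na.rm = TRUE)
sds   <- tapply(dat$bloodpressure, dat$GROUP, sd, na.rm = TRUE)

# One-way ANOVA followed by Tukey's HSD post-hoc comparison
fit <- aov(bloodpressure ~ GROUP, data = dat)
print(summary(fit))
tuk <- TukeyHSD(fit)$GROUP   # matrix with one row per pair, e.g. "D-A"
print(tuk)

# Translate one adjusted p-value into a conventional star label
p_DA <- tuk["D-A", "p adj"]
lab <- as.character(cut(p_DA, breaks = c(0, 0.001, 0.01, 0.05, 1),
                        labels = c("***", "**", "*", "ns"),
                        include.lowest = TRUE))

# Bar plot of the means; barplot() returns the x positions of the bars
top  <- max(means + sds)
mids <- barplot(means, ylim = c(0, top * 1.25),
                xlab = "GROUP", ylab = "bloodpressure")
# Error bars for mean +/- SD, drawn as flat-headed arrows
arrows(mids, means - sds, mids, means + sds, angle = 90, code = 3, length = 0.05)
# Hand-drawn bracket and label for the A vs. D comparison
segments(mids[1], top * 1.08, mids[4], top * 1.08)
text(mean(mids[c(1, 4)]), top * 1.14, labels = lab)
```

The bracket positions are computed from the bar midpoints, so further pairs could be marked the same way by stacking brackets at increasing heights.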

score 0 · Answer 2 · answered Jan 05 '13 at 16:42

After some more programming in R, I stumbled over another nice package suitable for medical research: psych. For the question above, describe() and describeBy() give a statistical overview of a data frame, the latter split by a grouping variable. The function error.bars.by() is an advanced plotting function for mean values +/- SD. The package offers many functions for covariate analysis, which are useful in psychological research but might also help in medical and marketing research.

score 0 · Answer 3 · answered Jan 05 '13 at 17:08

A possible code snippet:

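library(psych)

# Example data with missing values and a two-level grouping factor
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)
y <- c(2, 3, NA, 3, 4, NA, 2, 3, NA, 2)
group <- rep(factor(LETTERS[1:2]), 5)
df <- data.frame(x, y, group)
df

# Base R: per-group summaries of x
by(df$x, df$group, summary)
by(df$x, df$group, mean)   # NA for any group that contains a missing value

# sd() needs na.rm = TRUE when values are missing
sd(df$x)                   # result: NA
sd(df$x, na.rm = TRUE)     # result: 2.738613

# Per-group SD for several columns at once
v <- c("x", "y")           # or, equivalently:
v <- colnames(df)[1:2]
sapply(v, function(i) tapply(df[[i]], df$group, sd, na.rm = TRUE))

# psych: descriptive statistics by group
describeBy(df$x, df$group)

# psych: bar plot of the group means with error bars
error.bars.by(df$x, df$group, bars = TRUE)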
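
The last part of the question, repeating the tests for several variables and writing each result to its own file with sink(), might look like the base R sketch below. The data are simulated and the helper report_variable() is hypothetical; the variable names come from the question. pairwise.wilcox.test() runs pairwise Wilcoxon rank-sum tests, which are equivalent to the Mann-Whitney U test, here with Holm correction as one possible choice.

```r
# Simulated example data (assumption) with the read-outs named in the question
set.seed(2)
dat <- data.frame(
  GROUP = factor(rep(c("A", "B", "C", "D"), each = 10)),
  dia_bloodpressure  = rnorm(40, mean = 80,  sd = 8),
  sys_bloodpressure  = rnorm(40, mean = 125, sd = 12),
  mean_bloodpressure = rnorm(40, mean = 95,  sd = 9)
)
vars <- c("dia_bloodpressure", "sys_bloodpressure", "mean_bloodpressure")
out_dir <- tempdir()  # assumption: results go to a temporary directory

# Hypothetical helper: run all tests for one variable, write them to <variable>.txt
report_variable <- function(v, data, out_dir) {
  out_file <- file.path(out_dir, paste0(v, ".txt"))
  sink(out_file)      # redirect console output to the file
  on.exit(sink())     # restore console output even if a test fails
  cat("Variable:", v, "\n\n")
  fit <- aov(reformulate("GROUP", response = v), data = data)
  print(summary(fit))
  print(TukeyHSD(fit))
  # Pairwise Wilcoxon rank-sum (Mann-Whitney U) tests, Holm-adjusted
  print(pairwise.wilcox.test(data[[v]], data$GROUP, p.adjust.method = "holm"))
  invisible(out_file)
}

# One output file per variable; 'files' holds the resulting paths
files <- vapply(vars, report_variable, character(1), data = dat, out_dir = out_dir)
files
```

Wrapping sink() in a function with on.exit() keeps the console from staying redirected if one of the tests stops with an error.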
