I used the aggregate function in R to calculate the mean and SD for different groups at different time points:

aggregate(.~ Group + Time, data = x, FUN = function(x) c(m = mean(x), n = sd(x)))

The problem is that this also gives me the mean and SD of the ID column, so my result looks like this:

#   Time Group ID.m ID.n result.m result.n
# 1   0   x     20.5    10.0     6.5    1.15
# 2   1   x     20.5    10.0     8.0    2.13
# 3   0   y     20.5    10.0     7.0    2.66
...

How can I remove the mean and SD for ID? I would also like to plot the mean and mean ± SD for each group at the different time points (with time on the x-axis). How can I do this?

Cindy
  • Generally it looks something like `aggregate(. ~ cyl, mtcars, function(x){c(mean = mean(x), sd = sd(x))})`, but at some point this is the limitation that drives people to dplyr or data.table. – alistaire May 10 '17 at 20:23
  • How can I remove the ID result? @alistaire – Cindy May 10 '17 at 20:31
  • Don't include it on the left-hand side of your formula? It's hard to say without seeing the original data, which you should add [to make your example reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#5963610). – alistaire May 10 '17 at 20:44

1 Answer

Consider using the dplyr package, which is loaded along with the rest of the tidyverse. Its group_by and summarize functions replace your aggregate call. In my opinion, code written with the pipe operator (%>%) is easier to read:

# Libraries
library(tidyverse)

result_table <- mydata %>%       # Specify your table
  group_by(Group, Time) %>%      # Specify your groups (two variables in your case)
  summarize(m = mean(result),    # Calculate the mean of the result column for each group
            n = sd(result))      # Calculate the sd of the result column for each group

If all you want to do is keep certain columns of your result:

result_table %>% select(Time, Group, n, m) # using dplyr, or
result_table[, c('Time', 'Group', 'n', 'm')] # base R
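If you would rather stay with aggregate, as the comments suggest, a minimal sketch is to name only the column you want on the left-hand side of the formula. The data frame below is a hypothetical stand-in for yours, assuming the columns are called ID, Group, Time and result. Note that aggregate stores the two statistics as a single matrix column, so the sketch flattens it with do.call(data.frame, ...) to get ordinary result.m and result.n columns:

```r
# Hypothetical toy data standing in for the asker's data frame
x <- data.frame(
  ID     = 1:12,
  Group  = rep(c("x", "y"), each = 6),
  Time   = rep(c(0, 1), times = 6),
  result = c(5, 7, 6, 9, 8, 8, 6, 10, 7, 9, 8, 11)
)

# Only `result` is on the left-hand side, so ID is never summarised
agg <- aggregate(result ~ Group + Time, data = x,
                 FUN = function(v) c(m = mean(v), n = sd(v)))

# `agg$result` is a matrix with columns m and n; flatten it into
# separate data frame columns named result.m and result.n
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, flat[, c("Time", "Group", "result.m", "result.n")] selects columns as usual.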

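To make your plot, you can use ggplot2, which is also included in the tidyverse. Here the line shows the mean and the error bars show mean ± sd:

ggplot(data = result_table, aes(Time, m, colour = Group)) +
  geom_line() +                                               # mean of each group over time
  geom_errorbar(aes(ymin = m - n, ymax = m + n), width = 0.1) # from mean - sd to mean + sd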
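If you prefer not to load any packages, roughly the same picture can be drawn with base graphics. This is only a sketch on hypothetical toy data (the column names ID, Group, Time and result are assumed from the question). It draws one line of means per group and uses arrows() with flat heads as mean ± sd error bars:

```r
# Hypothetical toy data standing in for the asker's data frame
x <- data.frame(
  ID     = 1:12,
  Group  = rep(c("x", "y"), each = 6),
  Time   = rep(c(0, 1), times = 6),
  result = c(5, 7, 6, 9, 8, 8, 6, 10, 7, 9, 8, 11)
)

# Mean (m) and sd (n) per Group and Time, flattened to plain columns
agg <- do.call(data.frame,
               aggregate(result ~ Group + Time, data = x,
                         FUN = function(v) c(m = mean(v), n = sd(v))))

groups <- unique(agg$Group)
lower  <- agg$result.m - agg$result.n
upper  <- agg$result.m + agg$result.n

# Empty canvas large enough to hold every error bar
plot(NA, xlim = range(agg$Time), ylim = range(lower, upper),
     xlab = "Time", ylab = "result (mean +/- sd)")

for (i in seq_along(groups)) {
  g <- agg[agg$Group == groups[i], ]
  g <- g[order(g$Time), ]
  lines(g$Time, g$result.m, type = "b", col = i)        # group means over time
  arrows(g$Time, g$result.m - g$result.n,               # error bar: mean - sd ...
         g$Time, g$result.m + g$result.n,               # ... up to mean + sd
         angle = 90, code = 3, length = 0.05, col = i)
}
legend("topleft", legend = groups, col = seq_along(groups), lty = 1)
```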
Jeff Parker