I'm using aggregate in R to try to summarize my dataset. I currently have 3-5 observations per ID and I need to average these so that I have one value (the mean) per ID. Some columns return all NA when I use aggregate.

So far, I've created a separate summary for each column to average it, then tried to use merge to combine all of them. Some columns are character, so I tried converting them to numeric with as.numeric(as.character(column)), but that leaves too many NA values in the column.

library(dplyr)
Tr1 <-  data %>% group_by(ID) %>% summarise(mean = mean(Tr1))
Tr2 <-  data %>% group_by(ID) %>% summarise(mean = mean(Tr2))
Tr3 <-  data %>% group_by(ID) %>% summarise(mean = mean(Tr3))
data2 <- merge(Tr1,Tr2,Tr3, by = ID)

Running this code first gives a warning:

There were 50 or more warnings (use warnings() to see the first 50)

and then an error:

Error in fix.by(by.x, x) : 
'by' must specify one or more columns as numbers, names or logical

My original dataset looks like:

ID Tr1 Tr2 Tr3
1 4 5 6
1 5 3 9
1 3 5 9
4 5 1 8
4 2 6 4 
6 2 8 6
6 2 7 4
6 7 1 9 

and I am trying to find code that turns it into:

ID Tr1 Tr2 Tr3
1 4   4.3 8
4 3.5 3.5 6 
6 3.7 5.3 6.3
Cae.rich

1 Answer

You can use summarise_all() to average every non-grouping column in one step, instead of calling summarise() separately for each column:

library(dplyr)

data %>%
  group_by(ID) %>% 
  summarise_all(mean)

# A tibble: 3 x 4
     ID   Tr1   Tr2   Tr3
  <int> <dbl> <dbl> <dbl>
1     1  4     4.33  8   
2     4  3.5   3.5   6   
3     6  3.67  5.33  6.33
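
If some columns are stored as character, mean() returns NA for them with a warning, which would explain the all-NA columns. The merge() error is a separate problem: merge() joins only two data frames at a time and expects the key as a quoted name (by = "ID"), so the third data frame is taken as a by argument. Below is a minimal sketch of one way to handle the character columns, not a definitive fix. The sample data frame is hypothetical: it mirrors the question's data with Tr2 deliberately stored as character, and it assumes every non-ID column is supposed to be numeric.

```r
library(dplyr)

# Hypothetical sample mirroring the question's data; Tr2 is stored as
# character on purpose to reproduce the all-NA problem.
data <- data.frame(
  ID  = c(1, 1, 1, 4, 4, 6, 6, 6),
  Tr1 = c(4, 5, 3, 5, 2, 2, 2, 7),
  Tr2 = c("5", "3", "5", "1", "6", "8", "7", "1"),
  Tr3 = c(6, 9, 9, 8, 4, 6, 4, 9),
  stringsAsFactors = FALSE
)

# Convert every non-ID column to numeric. Values that cannot be parsed
# as numbers become NA (with a warning).
converted <- data %>%
  mutate(across(-ID, ~ as.numeric(as.character(.x))))

# One mean per ID for every remaining column. na.rm = TRUE skips NA
# values instead of returning NA for the whole group.
means <- converted %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

# Base R alternative on the converted data. Note that the formula
# interface drops any row containing an NA by default (na.action = na.omit).
means_base <- aggregate(. ~ ID, data = converted, FUN = mean)
```

across() needs dplyr 1.0.0 or later; on older versions summarise_all(mean, na.rm = TRUE) should behave the same way. If the conversion still produces many NA values, it is worth inspecting the raw strings (for example stray spaces, commas, or text placeholders for missing values) before averaging.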
Ritchie Sacramento