In this data frame each id has two rows. I want the mean of every column for each id. My concern is how to handle NA: when one of the two values is NA, the other value should be reported as the result.
id <- rep(1:3, each=2)
v1 <- c(1,2,5,NA,9,3)
v2 <- c(8,3,9,7,2,NA)
df <- data.frame(id, v1,v2)
df
id v1 v2
1 1 8
1 2 3
2 5 9
2 NA 7
3 9 2
3 3 NA
Expected outcome:
id <- c(1,2,3)
v1 <- c(1.5,5,6)
v2 <- c(5.5,8,2)
d <- data.frame(id,v1,v2)
d
id v1 v2
1 1 1.5 5.5
2 2 5.0 8.0
3 3 6.0 2.0
If I do it as below, any id and column combination that contains an NA comes out as NA:
library(dplyr)  # provides %>%, group_by() and summarise_each()
newdf <- df %>% group_by(id) %>% summarise_each(funs(mean))
newdf
# A tibble: 3 x 3
id v1 v2
<int> <dbl> <dbl>
1 1 1.5 5.5
2 2 NA 8
3 3 6 NA
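One possible way to get the expected outcome, offered as a minimal sketch rather than a definitive answer: mean() accepts na.rm = TRUE, which drops NA values before averaging, so a pair with a single NA returns the other value. The sketch uses base R's aggregate() so that it runs without extra packages; the name result is only illustrative and is not part of the original code.

```r
# Rebuild the example data from the question
id <- rep(1:3, each = 2)
v1 <- c(1, 2, 5, NA, 9, 3)
v2 <- c(8, 3, 9, 7, 2, NA)
df <- data.frame(id, v1, v2)

# Average every value column within each id.
# na.rm = TRUE is passed through to mean(), so NA values are ignored
# and a pair with one NA yields the remaining value.
# The data-frame interface is used on purpose: the formula interface of
# aggregate() drops whole rows containing NA by default.
result <- aggregate(df[c("v1", "v2")],
                    by = list(id = df$id),
                    FUN = mean,
                    na.rm = TRUE)
result
```

If both values for an id are NA, mean(..., na.rm = TRUE) returns NaN rather than NA, so that case may need separate handling. Assuming dplyr 1.0 or later, the same idea can be written in the pipeline style used above as df %>% group_by(id) %>% summarise(across(everything(), ~ mean(.x, na.rm = TRUE))).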