I have the following data frame:
library(tidyverse)
ID <- c('A','A','B','C','D','E','F')
Level1 <- c(20,50,30,10,15,10,NA)
Level2 <- c(40,33,84,NA,20,1,NA)
Level3 <- c(60,40,60,10,25,NA,NA)
Grade1 <- c(20,50,30,10,15,10,NA)
Grade2 <- c(40,33,84,NA,20,1,NA)
DF <- data.frame(ID,Level1,Level2,Level3,Grade1,Grade2)
  ID Level1 Level2 Level3 Grade1 Grade2
1  A     20     40     60     20     40
2  A     50     33     40     50     33
3  B     30     84     60     30     84
4  C     10     NA     10     10     NA
5  D     15     20     25     15     20
6  E     10      1     NA     10      1
7  F     NA     NA     NA     NA     NA
My goal is to group the data by ID and, for each group, calculate a single mean over the columns whose names contain the string "Level", ignoring NA values. Ideally, the output should look something like this:
ID mean (Level1+Level2+Level3)
A 40.5
B 58
C 10
....
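To make the target numbers concrete, this is roughly how I arrived at them by hand: for each ID, I pool every value from the three Level columns and take one mean, dropping NAs. The vectors are copied from DF above, and the variable names are only for illustration.

```r
# Hand-computed check of the expected output, using base R only.
# Values are copied from the Level1, Level2 and Level3 columns of DF.
level_values_A <- c(20, 50, 40, 33, 60, 40)  # both rows with ID "A"
level_values_B <- c(30, 84, 60)              # the single row with ID "B"
level_values_C <- c(10, NA, 10)              # ID "C"; Level2 is NA

expected_A <- mean(level_values_A, na.rm = TRUE)  # 40.5
expected_B <- mean(level_values_B, na.rm = TRUE)  # 58
expected_C <- mean(level_values_C, na.rm = TRUE)  # 10
```

The Grade columns are deliberately left out, since only the columns matching "Level" should contribute to the mean.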
Here is my code:
DF %>%
  group_by(ID) %>%
  select(starts_with('Level')) %>%
  summarise(mean(., na.rm = TRUE))
When I run the code, I get the following output:
Adding missing grouping variables: `ID`
# A tibble: 6 x 2
  ID    `mean(., na.rm = TRUE)`
  <fct>                   <dbl>
1 A                          NA
2 B                          NA
3 C                          NA
4 D                          NA
5 E                          NA
6 F                          NA
Warning messages:
1: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
2: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
3: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
4: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
5: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
6: In mean.default(., na.rm = TRUE) :
argument is not numeric or logical: returning NA
Could you please help me understand what is wrong with my code? For proposed solutions: 1) columns should be selected by matching column names against a string, using dplyr functions like starts_with() or contains(); 2) I would also like to avoid pivoting or gather functions, if that is possible.
I appreciate your help.