Along with the other problems in this question, you asked for help with three different objectives. In other words, you asked three questions in one. That's also frowned upon.
This code addresses your first objective:
library(tidyverse)
ID <- c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3)
X1<-c(1.1,0.2,0.4,0.8,1.3,2.3,1.1,3.2,NA,0.8,2.1,NA,1.1,0.2,0.4,0.8,NA,0.6)
X2<-c(0.8,NA,1.2,0.3,NA,NA,0.8,NA,1.5,2.7,2.2,NA,0.8,3.1,1.7,0.3,1.1,2.4)
X3<-c(0.1,0.3,1.1,2.2,0,NA,0.1,3.3,1.4,2.3,0,NA,NA,0.3,2.8,2.3,0,NA)
Time<-c("baseline","week1","week2","week3","week4","week5","baseline","week1","week2","week3","week4","week5","baseline","week1","week2","week3","week4","week5")
data<-data.frame(ID,X1,X2,X3,Time)
data %>%
  pivot_longer(cols = c(X1, X2, X3), names_to = "Xtypes") %>%
  group_by(ID, Time) %>%
  summarize(sumNA = sum(is.na(value)), meanNA = mean(is.na(value)), sdNA = sd(is.na(value)))
# That returns the following:
`summarise()` has grouped output by 'ID'. You can override using the `.groups` argument.
# A tibble: 18 × 5
# Groups:   ID [3]
      ID Time     sumNA meanNA  sdNA
   <dbl> <chr>    <int>  <dbl> <dbl>
 1     1 baseline     0  0     0
 2     1 week1        1  0.333 0.577
 3     1 week2        0  0     0
 4     1 week3        0  0     0
 5     1 week4        1  0.333 0.577
 6     1 week5        2  0.667 0.577
 7     2 baseline     0  0     0
 8     2 week1        1  0.333 0.577
 9     2 week2        1  0.333 0.577
10     2 week3        0  0     0
11     2 week4        0  0     0
12     2 week5        3  1     0
13     3 baseline     1  0.333 0.577
14     3 week1        0  0     0
15     3 week2        0  0     0
16     3 week3        0  0     0
17     3 week4        1  0.333 0.577
18     3 week5        1  0.333 0.577
pivot_longer() changes the shape of your data frame, group_by() groups the data according to the variable(s) named, and summarize() is the verb that runs the function(s) therein, once per group. You asked for a sum ("number of"), mean, and sd.
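To make the three steps concrete, here is a minimal sketch on a made-up two-row frame. The names toy, long and res are hypothetical and only for illustration; they are not part of your data. The `.groups = "drop"` argument is the one the summarise() message mentions, used here only to silence that message.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy frame, not the data from the question
toy <- data.frame(ID   = c(1, 1),
                  Time = c("baseline", "week1"),
                  X1   = c(1.1, NA),
                  X2   = c(NA, NA))

# Step 1: pivot_longer() stacks X1 and X2 into a single "value" column,
# so each (ID, Time) pair now spans two rows instead of one
long <- toy %>%
  pivot_longer(cols = c(X1, X2), names_to = "Xtypes")

# Steps 2 and 3: group_by() marks the groups, summarize() collapses
# each group to one row by counting its missing values
res <- long %>%
  group_by(ID, Time) %>%
  summarize(sumNA = sum(is.na(value)), .groups = "drop")
```

Printing long shows four rows (two per time point), and res has one row per ID and Time combination.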
You also wrote "... but when Time=baseline". I don't know what you mean by that. Were you looking only for when literally Time=="baseline"? If that's the case, you want this instead:
data %>%
  pivot_longer(cols = c(X1, X2, X3), names_to = "Xtypes") %>%
  group_by(ID) %>%
  filter(Time == "baseline") %>%
  summarize(sumNA = sum(is.na(value)), meanNA = mean(is.na(value)), sdNA = sd(is.na(value)))
If you meant NOT when Time=baseline, change the == in filter to !=.
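For reference, a sketch of that != variant written out in full. It rebuilds the same data frame so it runs on its own; the result name not_baseline is just an illustrative choice, and I've put filter() before group_by(), which should not change the result for a row-wise condition like this one.

```r
library(dplyr)
library(tidyr)

# Same values as the data frame above, rebuilt so this block is self-contained
data <- data.frame(
  ID   = rep(c(1, 2, 3), each = 6),
  X1   = c(1.1,0.2,0.4,0.8,1.3,2.3,1.1,3.2,NA,0.8,2.1,NA,1.1,0.2,0.4,0.8,NA,0.6),
  X2   = c(0.8,NA,1.2,0.3,NA,NA,0.8,NA,1.5,2.7,2.2,NA,0.8,3.1,1.7,0.3,1.1,2.4),
  X3   = c(0.1,0.3,1.1,2.2,0,NA,0.1,3.3,1.4,2.3,0,NA,NA,0.3,2.8,2.3,0,NA),
  Time = rep(c("baseline", paste0("week", 1:5)), times = 3)
)

# Keep every row except baseline, then summarize missingness per ID
not_baseline <- data %>%
  pivot_longer(cols = c(X1, X2, X3), names_to = "Xtypes") %>%
  filter(Time != "baseline") %>%
  group_by(ID) %>%
  summarize(sumNA  = sum(is.na(value)),
            meanNA = mean(is.na(value)),
            sdNA   = sd(is.na(value)))
```

With this data, sumNA comes out as 4, 5 and 2 for IDs 1 to 3, which matches adding up the week1 to week5 rows of the first table.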