I'm trying to write a function that reports the number of missing values in one variable, given that another variable is not missing. My code is below:
library(dplyr)  # provides %>%, filter() and summarise()

var1 <- c(1:10, rep(NA, 5))
var2 <- c(rep(NA, 10), 21:25)
var3 <- as.factor(c(rep(NA, 3), rep("V", 8), rep("U", 4)))
var4 <- c(rep(NA, 3), 1:9, rep(NA, 3))
data <- data.frame(var1, var2, var3, var4)
The code below gives me what I want, which is the count of missing values in one variable among the rows where the other variable is not missing:
data %>% filter(!is.na(var3)) %>% summarise(Missing=sum(is.na(var2)))
data %>% filter(!is.na(var2)) %>% summarise(Missing=sum(is.na(var4)))
But when I put them in a function, it didn't work:
count_missing <- function(data, a, b) {
  data %>% filter(!is.na(a)) %>% summarise(Missing = sum(is.na(b)))
}
For example:
count_missing(data,var2,var4)
should give me:
  Missing
1       3
But instead, it returned:
  Missing
1       6
My guess is that summarise(Missing = sum(is.na(b))) inside the function isn't taking b from the data coming through the pipeline.
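If that guess is right, a and b are being resolved outside the piped data, most likely as the free-standing vectors var2 and var4 in the global environment. That would explain the 6, since var4 contains six NAs in total. Below is a minimal sketch of a variant that forwards the column names into the data frame instead. It assumes a reasonably recent dplyr/rlang that supports the {{ }} embrace operator, and count_missing2 is just a hypothetical name for illustration:

```r
library(dplyr)

# Rebuild the example data with the columns defined inline, so the sketch
# does not rely on the free-standing var1..var4 vectors.
data <- data.frame(
  var1 = c(1:10, rep(NA, 5)),
  var2 = c(rep(NA, 10), 21:25),
  var3 = as.factor(c(rep(NA, 3), rep("V", 8), rep("U", 4))),
  var4 = c(rep(NA, 3), 1:9, rep(NA, 3))
)

# Hypothetical variant of count_missing(): {{ }} forwards the bare column
# names so that a and b are looked up in the piped data frame first.
count_missing2 <- function(data, a, b) {
  data %>%
    filter(!is.na({{ a }})) %>%
    summarise(Missing = sum(is.na({{ b }})))
}

count_missing2(data, var2, var4)
#   Missing
# 1       3
```

With this version, the count is computed only over the rows that survive the filter, which matches the result of the standalone pipeline.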
Can anyone help me with this? Thank you!