I have a data frame that looks something like this:
x y
1 a
1 b
1 c
1 NA
1 NA
2 d
2 e
2 NA
2 NA
My desired output is a data frame that shows, for each X, the number of complete cases of Y (that is, the non-NA values). For example, if Y has 2500 complete observations for X = 1 and 557 for X = 2, I should get this simple data frame:
x y(c.cases)
1 2500
2 557
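For the small example data frame at the top, the kind of per-X count I am after could be sketched as follows. This is only an illustration: the names `df`, `counts` and `c.cases` are made up here, and the toy data gives 3 and 2 rather than 2500 and 557.

```r
# Toy data matching the example above; `df` is an illustrative name.
df <- data.frame(
  x = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  y = c("a", "b", "c", NA, NA, "d", "e", NA, NA)
)

# Count the non-NA values of y separately for each value of x.
counts <- tapply(!is.na(df$y), df$x, sum)

# Arrange the per-x counts as a two-column data frame.
result <- data.frame(
  x = as.numeric(names(counts)),
  c.cases = as.vector(counts)
)
print(result)
```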
Currently my function works for a single X, but when I pass a range of Xs (for example 30:25), I get the total number of complete cases across all the specified Xs instead of a separate count for each X. This is an outline of my function:
complete <- function(){
    files <- file.list()
    dat <- c()    # Creates an empty vector
    Y <- c()      # Empty vector that will list down the Ys
    result <- c()
    for(i in c(X)){
        dat <- rbind(dat, read.csv(files[i]))
    }
    dat_subset_Y <- dat[which(dat[, 'X'] %in% x), ]
    Y <- c(Y, sum(complete.cases(dat)))
    result <- cbind(X, Y)
    print(result)
}
There are no errors or warning messages, only wrong results when X is a range.
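As far as I can tell, the likely cause is that sum(complete.cases(dat)) runs once, after the loop, on the combined data, so it produces a single total that cbind then repeats for every X. Below is a minimal sketch of a per-X version. It rests on assumptions not stated in my outline: `directory` and `ids` are hypothetical parameter names, list.files() stands in for my file-listing step, and the i-th file is assumed to hold the observations for X = i.

```r
# Sketch only: `directory` and `ids` are hypothetical parameter names, and the
# i-th file (in sorted order) is assumed to hold the data for X = i.
complete <- function(directory, ids) {
  files <- list.files(directory, full.names = TRUE)

  # One slot per requested id, filled in inside the loop.
  counts <- integer(length(ids))
  for (k in seq_along(ids)) {
    dat <- read.csv(files[ids[k]])
    # Count rows with no NA in this file only, not in the combined data.
    counts[k] <- sum(complete.cases(dat))
  }

  data.frame(X = ids, Y = counts)
}

# Usage with two small temporary files (made-up data).
demo_dir <- file.path(tempdir(), "complete_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(X = 1, Y = c("a", "b", "c", NA, NA)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(X = 2, Y = c("d", "e", NA, NA)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

res <- complete(demo_dir, 2:1)
print(res)
```

With a range such as 2:1, this returns one row per X in the order requested, instead of a single total repeated for every X.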