I am trying to write a function that loops through the columns of my data and performs different operations depending on each column's data type. It never enters the if statements, because is.numeric(data[,i]) returns FALSE even when the corresponding is.numeric(data$variable) returns TRUE. I am not sure of the best way to approach this problem. Please let me know if you can help. Thanks!
Here is the function:
get_summary_stats <- function(data) {
  results <- list()
  for (i in names(data)) {
    var.name <- names(data[,i])
    if (is.numeric(data[,i])) {
      med.est <- median(data[,i])
      min.est <- min(data[,i])
      max.est <- max(data[,i])
      mean.est <- mean(data[,i])
      SD <- sd(data[,i])
      num.na <- sum(is.na(data[,i]))
      results[[i]] <- c(var.name, num.na, mean.est, SD, med.est, min.est, max.est)
    }
    if (is.factor(data[,i])) {
      var.labels <- levels(data[,i])
      counts <- as.data.frame(table(data[,i]))
      total <- sum(counts$Freq)
      num.na <- c("NA", nrow(data) - total)
      counts <- rbind(counts, num.na)
      counts$Percent <- (counts$Freq / total) * 100
      results[[i]] <- counts
    }
  }
  return(results)
}
Here is an example of the issue:
> is.numeric(full_data[,"Patient Age [70: Age]"])
[1] FALSE
> is.numeric(full_data$`Patient Age [70: Age]`)
[1] TRUE
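One possible explanation, assuming full_data is a tibble (for example, read in with readr or readxl) rather than a plain data.frame: on a tibble, `[ , i]` always returns a one-column tibble, and is.numeric() is FALSE for any data frame, whereas `$` and `[[` return the column vector itself. The sketch below reproduces that behaviour in base R with drop = FALSE and then rewrites the loop around `[[`. The data frame df, the function name get_summary_stats_sketch, the use of na.rm = TRUE, and the named-vector output are all assumptions made for illustration, not part of the original code.

```r
# Hypothetical stand-in for full_data; the real data is not shown in the post.
df <- data.frame(age = c(34, 51, NA, 70),
                 sex = factor(c("F", "M", "F", "M")))

# drop = FALSE keeps a one-column data frame, which mimics what `[ , i]`
# returns on a tibble.
is.numeric(df[, "age", drop = FALSE])  # FALSE: a data frame, not a vector
is.numeric(df[["age"]])                # TRUE: `[[` extracts the column itself

# Sketch of the same loop using `[[` (assumed fix, not the original function).
get_summary_stats_sketch <- function(data) {
  results <- list()
  for (i in names(data)) {
    col <- data[[i]]  # column vector, for data.frames and tibbles alike
    if (is.numeric(col)) {
      # na.rm = TRUE is an assumption; without it any NA makes the stats NA.
      results[[i]] <- c(n_missing = sum(is.na(col)),
                        mean      = mean(col, na.rm = TRUE),
                        sd        = sd(col, na.rm = TRUE),
                        median    = median(col, na.rm = TRUE),
                        min       = min(col, na.rm = TRUE),
                        max       = max(col, na.rm = TRUE))
    } else if (is.factor(col)) {
      # useNA = "always" adds the NA row directly, so no rbind() is needed.
      counts <- as.data.frame(table(col, useNA = "always", dnn = "level"))
      total <- sum(!is.na(col))  # same denominator as the original: non-missing
      counts$Percent <- counts$Freq / total * 100
      results[[i]] <- counts
    }
  }
  results
}

res <- get_summary_stats_sketch(df)
```

If this guess is right, class(full_data) should include "tbl_df". The sketch keeps the numeric results as a named numeric vector instead of mixing in the column name, since c() with a character value would coerce every statistic to a string; the column name is already the list key.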