I have this code that builds a summary data frame (from this question):
lst <- lapply(1:ncol(mtcars), function(i){
  x <- mtcars[[i]]
  data.frame(
    Variable_name = colnames(mtcars)[[i]],
    sum_unique = NROW(unique(x)),
    NA_count = sum(is.na(x)),
    NA_percent = round(sum(is.na(x))/NROW(x),2))
})
do.call(rbind, lst)
I want to add the five highest and five lowest values of each column to it:
lst <- lapply(1:ncol(mtcars), function(i){
  x <- mtcars[[i]]
  data.frame(
    variable_name = colnames(mtcars)[[i]],
    distinct = NROW(unique(x)),
    NA_count = sum(is.na(x)),
    NA_percent = round(sum(is.na(x))/NROW(x),2),
    first_5 = paste0(sort(x, decreasing=TRUE)[1:5],";"),
    last_5 = paste0(sort(x)[1:5],";")
  )
})
do.call(rbind, lst)
But this creates a new row for each of the first_5 and last_5 values, so every variable ends up with five rows instead of one. Why does this happen, and how can I fix it?
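
A minimal sketch of what seems to be going on, and one possible way around it. The likely cause is that `paste0(x, ";")` is vectorised, so it returns five strings, and `data.frame()` then recycles the length-1 columns to match, giving five rows. Passing `collapse = ";"` to `paste()` joins the values into a single string instead. The helper name `summarise_column` and its `n` argument are made up for illustration and are not part of the original code.

```r
# Reproduce the behaviour: a length-5 column forces 5 rows,
# and the length-1 column "a" is recycled to fill them.
demo <- data.frame(a = "x", b = paste0(1:5, ";"))
nrow(demo)  # 5

# Hypothetical helper (not from the original post): summarise one column
# into a single-row data frame.
summarise_column <- function(x, name, n = 5) {
  data.frame(
    variable_name = name,
    distinct      = length(unique(x)),
    NA_count      = sum(is.na(x)),
    NA_percent    = round(mean(is.na(x)), 2),
    # collapse = ";" joins the n values into ONE string, so each of
    # these columns has length 1 and no recycling takes place.
    first_5       = paste(head(sort(x, decreasing = TRUE), n), collapse = ";"),
    last_5        = paste(head(sort(x), n), collapse = ";")
  )
}

lst <- lapply(names(mtcars), function(nm) summarise_column(mtcars[[nm]], nm))
result <- do.call(rbind, lst)
nrow(result)  # 11, one row per column of mtcars
```

Note that `sort()` drops `NA` values by default, so under this sketch a column with missing values would simply list its five largest and smallest non-missing values.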