I need to take a directory full of files, read them, remove the rows containing NA values, and then keep only the files with more than a certain number of rows, which will then have correlations run on them. I have everything working up to the step that filters by row count, which I can't seem to manage.
corr <- function(directory, threshold = 0){
#reads directory of files
file_list <- list.files(path = getwd())
# takes file_list and makes each file into dataframe
dflist <- lapply(file_list, read.csv)
# returns list of files, na rows stripped
nolist <- lapply(dflist, na.omit)
# keeps only the data frames with nrow > threshold
abovelist <- c()
for(file in nolist){
if (nrow(file) > threshold)
{append(abovelist, file)}
}
#
}
As you can see, I've tried using a for loop that appends the data frames with nrow > threshold. But whenever I run this step, abovelist is still NULL afterwards. I've also noticed the following behavior with square brackets:
> nrow(nolist[1])
NULL
> nrow(nolist[[1]])
117
It seems like some functions access the dataframes in nolist as one-element lists, while others actually get at the dataframes themselves (which is what I want here). How do I make sure I get the dataframe itself, here and in general?
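For reference, here is a minimal sketch of the behavior I'm describing and of what I think the filtering step should look like. It assumes a small made-up list standing in for nolist (the data frames, the threshold value, and the names abovelist2 and abovelist3 are hypothetical, not my real data). As far as I can tell, single brackets return a sub-list while double brackets return the element itself, and append() returns a new object rather than modifying abovelist in place, which would explain the NULL. Treat this as a sketch, not a confirmed fix.

```r
# Hypothetical stand-in for nolist: a list of two small data frames
nolist <- list(
  data.frame(x = 1:3, y = c(2, 4, 6)),
  data.frame(x = 1:5, y = c(1, 3, 5, 7, 9))
)

# `[` returns a sub-list (a list of length 1), so nrow() gives NULL
class(nolist[1])    # "list"
nrow(nolist[1])     # NULL

# `[[` extracts the element itself, here a data frame
class(nolist[[1]])  # "data.frame"
nrow(nolist[[1]])   # 3

threshold <- 3  # assumed value, for illustration only

# Option 1: build a logical vector of row-count tests and subset with `[`,
# which keeps the result as a list of data frames
abovelist <- nolist[sapply(nolist, nrow) > threshold]

# Option 2: Filter() applies the predicate to each element
abovelist2 <- Filter(function(df) nrow(df) > threshold, nolist)

# Option 3: the original loop, with two changes:
#   - the result of append() is assigned back to the accumulator
#   - the data frame is wrapped in list() so it is added as one element
#     instead of being spliced in column by column
abovelist3 <- list()
for (df in nolist) {
  if (nrow(df) > threshold) {
    abovelist3 <- append(abovelist3, list(df))
  }
}

length(abovelist)      # 1
nrow(abovelist[[1]])   # 5
```

Inside lapply(), sapply(), Filter(), and a for loop over a list, each element is passed as the data frame itself (the `[[` behavior), so nrow() works there without any extra indexing.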