I am lost trying to take a folder of CSV files and merge them into a single data frame. The files are named 1.csv to 332.csv and sit in one folder (which is currently my working directory).

What I am trying to end up with is a data frame from which I can extract the mean of a column over the complete cases, plus a count of the complete cases.

Here's where my code currently stands:

# List a set of  the files
fileList = list.files(pattern="*.csv")

# Make data frame for each file
df = c(rep(data.frame(), length(fileList)))

# Read csv files into data frames
for (i in 1:length(fileList)) { df[[i]] <- as.list(read.csv(fileList[i])) }

#merge data frames into a single data frame
fullFrame <- rbind(df[[i]])

#isolate to just complete cases
completeFrame <- complete.cases(fullFrame)

fullFrame[completeFrame]

My expectation was to get a large table-like view of all the cases together, with the NA cases removed.

Instead it outputs:

> fullFrame[completeFrame]

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
NULL

[[6]]
NULL

[[7]]
NULL

[[8]]
NULL

[[9]]
NULL

[[10]]
NULL

[[11]]
NULL

[[12]]
NULL

[[13]]
NULL

[[14]]
NULL

[[15]]
NULL

[[16]]
NULL
Frank
  • Is this not a duplicate of http://stackoverflow.com/questions/11433432/importing-multiple-csv-files-into-r ? – zx8754 Oct 16 '16 at 20:51
  • Something like: `do.call(rbind, lapply(list.files(pattern = "\\.csv$"), function(i){ x <- read.csv(i); x[complete.cases(x), ] }))` ? – zx8754 Oct 16 '16 at 20:53
  • The answer on that question, `temp = list.files(pattern="*.csv"); myfiles = lapply(temp, read.delim)`, imports the 332 data frames into a list, so that solves the first half of my question fine, but I don't understand how to bind them back together to, say, get a mean. I just have a list of 332 separate tuples. – Mathias Burton Oct 17 '16 at 00:22

1 Answer

Even though you want a data.frame, data.table offers extremely fast and stupid-proof functions for dealing with this exact problem:

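library(data.table)

# pattern is a regular expression, not a glob, so match the extension explicitly
fileList <- list.files(pattern = "\\.csv$")
listing <- lapply(fileList, fread)  # read each file into a data.table
dt <- rbindlist(listing)            # if columns differ between files, add fill = TRUE
dt <- na.omit(dt)                   # keep only complete cases
df <- as.data.frame(dt)             # convert back to a plain data.frame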
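
If you would rather stay in base R, the same steps can be sketched as below. This is only an illustration under assumptions: the three demo files, the temporary folder, and the column names `ID` and `value` are made up here to stand in for your 332 files, so swap in your own folder and column. It also shows the two things that seem to go wrong in the question's code: `rbind(df[[i]])` runs after the loop and so only sees the last file, and indexing without a comma does not select rows.

```r
# Hypothetical demo data: three small CSV files in a temporary folder,
# standing in for 1.csv ... 332.csv. Column names are assumptions.
demoDir <- file.path(tempdir(), "csv_demo")
dir.create(demoDir, showWarnings = FALSE)
write.csv(data.frame(ID = 1, value = c(1, NA, 3)),
          file.path(demoDir, "1.csv"), row.names = FALSE)
write.csv(data.frame(ID = 2, value = c(5, 7)),
          file.path(demoDir, "2.csv"), row.names = FALSE)
write.csv(data.frame(ID = 3, value = c(NA, 9)),
          file.path(demoDir, "3.csv"), row.names = FALSE)

# pattern is a regular expression; full.names = TRUE returns usable paths
fileList <- list.files(demoDir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list of data frames, then stack them row-wise
frames <- lapply(fileList, read.csv)
fullFrame <- do.call(rbind, frames)

# Keep only rows with no NA in any column (note the comma: rows, not columns)
completeFrame <- fullFrame[complete.cases(fullFrame), ]

nComplete  <- nrow(completeFrame)         # count of complete cases
meanValue  <- mean(completeFrame$value)   # mean of one column over complete cases
countsById <- table(completeFrame$ID)     # complete cases per source file

print(nComplete)
print(meanValue)
print(countsById)
```

With the real data you would drop the demo-file section, call `list.files(pattern = "\\.csv$")` in your working directory, and replace `value` with the column you want the mean of.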
Brandon Bertelsen