1

I have hundreds of .csv files that are structured like this:

xyz25012013 <- data.frame(province = c("AB", "BC", "ON"), high = c(30, 20, 25), low = c(5, 2, 3))
xyz13122014 <- data.frame(province = c("AB", "BC", "ON"), high = c(20, 34, 25), low = c(1, 8, 3))
xyz30042014 <- data.frame(province = c("AB", "BC", "ON"), high = c(50, 21, 27), low = c(1, 9, 26))
xyz04072015 <- data.frame(province = c("AB", "BC", "ON"), high = c(26, 07, 90), low = c(4, 7, 3))

I want to import and merge/row bind all of them and retain the metadata date contained in the filename.

as.Date(substr(<filename>,4,11) format = "%d%m%Y")

I want the final output to look sort of like this:

date <- c(rep("25012013", 3), rep("13122014", 3), rep("30042014", 3), rep("04072015", 3))
xyz <- rbind(xyz25012013, xyz13122014, xyz30042014, xyz04072015)
xyz <- cbind(xyz, date)
xyz$date <- as.Date(xyz$date, format = "%d%m%Y")
print(xyz)
emehex
  • 9,874
  • 10
  • 54
  • 100
  • Looks like some variation of [this](http://stackoverflow.com/questions/8091303/merge-multiple-data-frames-in-a-list-simultaneously) – David Arenburg Jun 24 '15 at 17:48
  • how about `list.files(pattern="xyz")` get dates with `substr` and then `rbind` ,maybe with `tapply`/`lapply` – mts Jun 24 '15 at 18:06

2 Answers2

3

I think this does what you'd like and should be relatively efficient:

##  Create a file list to operate on:
files <- list.files(path=".", pattern="*.csv")

##  Read in our data from each CSV into a list structure:
csvs <- lapply(files, function(x) { 
  d <- read.csv(x); 
  d$date <- as.Date(substr(x,4,11), format="%d%m%Y"); 
  d 
})

##  rbind our CSV data together:
d <- do.call(rbind, csvs)

The result:

> head(d)
  X province high low      dates
1 1       AB   26   4 2015-07-04
2 2       BC    7   7 2015-07-04
3 3       ON   90   3 2015-07-04
4 1       AB   20   1 2014-12-13
5 2       BC   34   8 2014-12-13
6 3       ON   25   3 2014-12-13
Forrest R. Stevens
  • 3,435
  • 13
  • 21
2

Suppose all of your files are in the "test" file folder:

library(readr)
files = list.files("test/")
dd = vector("list", length = length(files))
for (i in seq_along(files)){
  dd[[i]] = read_csv(file = paste0("test/", files[i]))
  dd[[i]]$date = as.Date(substr(files[i], 4, 11), format = "%d%m%Y")
}

merged = do.call(rbind, dd)
Daijiang Li
  • 697
  • 4
  • 16