I have the following R code to load xts timeseries from multiple files and merge them in a single xts matrix:

load.files = function(dates, filenames) {
  # seq_along() is safe even if dates is empty, unlike 1:length(dates)
  for( i in seq_along(dates) ) {
    # load and merge each xts block
    ts.set = load.single.file(dates[i], filenames[i])
    if( i == 1 )
      ts.all = ts.set
    else
      ts.all = rbind(ts.all, ts.set)
  }

  return(ts.all)
}

Is there a way to

  1. Avoid the if/else statement required to initialize the very first ts.set?
  2. Avoid the for loop altogether?
Robert Kubrick

1 Answer

I often use a construct like this, which avoids explicit loop construction.

The strategy is to first read the files into a list of data.frames, and to then rbind together the elements of that list into a single data.frame. You can presumably adapt the same logic to your situation.

filenames <- c("a.csv", "b.csv", "c.csv")
l <- lapply(filenames, read.csv)
do.call("rbind", l)
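
Adapted to a loader that takes a date and a filename, as in the question, the same idea might look like the sketch below. `load.single.file()` here is a hypothetical stand-in that builds a small data.frame rather than reading an xts block, and the dates and filenames are made up for illustration. `Map()` walks both vectors in parallel and always returns a list, so the first-iteration special case disappears.

```r
# Hypothetical stand-in for the question's load.single.file(); it builds a
# small data.frame instead of reading an xts block from disk.
load.single.file <- function(date, filename) {
  data.frame(date = date, file = filename, value = nchar(filename),
             stringsAsFactors = FALSE)
}

load.files <- function(dates, filenames) {
  # Map() calls the loader once per (date, filename) pair and returns a list.
  blocks <- Map(load.single.file, dates, filenames)
  # Row-bind every block in one call; no if/else and no explicit loop.
  do.call(rbind, blocks)
}

# Made-up inputs for illustration.
dates     <- c("2012-01-03", "2012-01-04", "2012-01-05")
filenames <- c("a.csv", "bb.csv", "ccc.csv")
ts.all    <- load.files(dates, filenames)
```

If the real loader returns xts objects, `do.call(rbind, blocks)` should dispatch to the xts `rbind` method in the same way, though that is not exercised here.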
Josh O'Brien
  • This seems to work, but I'm getting these warning messages on the call to lapply. What do they mean? Note that I'm using an extra parameter in my version: `l = lapply(dates, load.ets.trades, filenames)`. Warning messages: 1: `In file(file, "rt") : only first element of 'description' argument used` 2: `In file(file, "rt") : only first element of 'description' argument used` – Robert Kubrick Jan 04 '12 at 23:29
  • `filenames` is a list/vector but `load.ets.trades` expects a single string? Using dates[i] with filenames[i] has a bad _smell_. Instead having them as two columns in a data frame/matrix feels more robust. (Then you use by() or apply(), see http://stackoverflow.com/questions/1699046/foreach-row-in-an-r-dataframe) – Darren Cook Jan 05 '12 at 01:47
  • As @DarrenCook mentions, that warning is telling you that on each lapply iteration, `load.ets.trades()` is being passed the entire `filenames` vector. It uses only the first element, again and again, and warns you that it is doing so. The solution I'd use would be to call `mapply()`, which is designed for just this situation. Your call will look something like `l <- mapply(FUN=load.ets.trades, date=dates, filename=filenames)`, where `date` and `filename` are the names of the formal arguments in `load.ets.trades`. Please let us know if this (or something else) solves your problem. – Josh O'Brien Jan 05 '12 at 05:10
  • @JoshO'Brien Thanks! I don't know if it helped Robert, but it helped me understand the problems mapply is designed to solve (far more than its help page or the books I've read did)! A quick test confirms it can be used with data frames too, as `mapply(FUN=load.ets.trades, date=d$dates, filename=d$filenames)` – Darren Cook Jan 05 '12 at 09:32
  • @JoshO'Brien Seems like mapply is exactly what I need, but I get an error: `Error in dimnames(data) <- dimnames : length of 'dimnames' [1] not equal to array extent`. I checked in the debugger; the problem occurs when returning from the second (and last) call to load.ets.trades(): `(date = dots[[1L]][[2L]], filename = dots[[2L]][[2L]])` – Robert Kubrick Jan 05 '12 at 15:18
  • @JoshO'Brien The last error only happens when I return an xts() object from the load.ets.trades() call. I mentioned I was using xts objects in the question but maybe it wasn't very clear from the code I posted. I might need to open a new question specifically on apply/mapply and xts objects. – Robert Kubrick Jan 05 '12 at 16:39
  • You might try setting the mapply argument `SIMPLIFY=FALSE`. Beyond that, I think you're right that a new question, including examples of the data objects you're loading, and how you are loading them is probably the way to go. Good luck! – Josh O'Brien Jan 05 '12 at 17:07
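
To illustrate the `SIMPLIFY=FALSE` suggestion from the comments, here is a minimal sketch using a hypothetical `load.ets.trades()` that returns a small data.frame (the real loader, its data, and the file names are assumptions). It only shows how the shape of the `mapply()` result differs; it does not reproduce the xts error itself.

```r
# Hypothetical loader: returns a two-row data.frame per (date, filename) pair.
load.ets.trades <- function(date, filename) {
  data.frame(date = rep(date, 2), filename = rep(filename, 2),
             price = c(1.5, 2.5), stringsAsFactors = FALSE)
}

# Made-up inputs for illustration.
dates     <- c("2012-01-03", "2012-01-04")
filenames <- c("a.csv", "b.csv")

# Default SIMPLIFY = TRUE: mapply() tries to collapse the results into an
# array, so the individual data.frames are lost.
simplified <- mapply(FUN = load.ets.trades, date = dates, filename = filenames)

# SIMPLIFY = FALSE: the result stays a plain list with one element per call,
# which can then be passed to rbind.
l <- mapply(FUN = load.ets.trades, date = dates, filename = filenames,
            SIMPLIFY = FALSE)
merged <- do.call(rbind, l)
```

Keeping the result as a list leaves each returned object intact, which is usually what you want before a `do.call(rbind, ...)`.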