I want a function that I can call several times throughout a data analysis script, each time appending a new data frame to an existing list.

myList <- list()

The function creates a new data frame after subsetting an existing data frame, and then appends this new data frame to my list (in theory).

appendList = function(){
  df = mydf[mydf$myData < 0.5, ]
  myList[[(length(myList)+1)]] <- df
}

In my real-world problem I have several different code chunks, each with a different set of data in the column 'myData'.

I thought I could just use my function above like this:

mydf <- data.frame(myData = runif(10))
appendList()

mydf <- data.frame(myData = rnorm(10))
appendList()

But my list remains unchanged:

> length(myList)
[1] 0

Is it an environment issue?

My goal is for 'myList' to contain all of these different data frames.

Bonus: Perhaps there is a better way to complete this kind of task?

BDA
  • You need to assign the results of `appendList` back to your original `myList` - like: `myList <- appendList()`. "Growing lists" is not the accepted R method though, and will be excruciatingly slow at times. Instead, make a `list` of all your subsets in one go using `lapply` or similar functions. – thelatemail Nov 20 '15 at 00:04
  • You'd need to change your line to `myList[[(length(myList)+1)]] <<- df`: note the `<<-`, which lets you change a global variable from within a function. But generally you don't want to build a list item by item: that's a slow operation in R. In fact, lists of data frames are usually a sign you're doing something wrong, especially if each comes from some filtering. Could you share a larger part of your analysis script? – David Robinson Nov 20 '15 at 00:08
  • @DavidRobinson - `<<-` doesn't necessarily change variables in the global environment - see: http://stackoverflow.com/a/10904810/496803 – thelatemail Nov 20 '15 at 00:17
  • @thelatemail It does in this case, and I was oversimplifying since this isn't the best solution anyway. (btw, the solution of `myList <-` doesn't quite work because the value is not returned) – David Robinson Nov 20 '15 at 00:21
  • @DavidRobinson - true, the last line of `appendList` should be `return(myList)` or `myList`, then `myList <- appendList(...)` would work. – thelatemail Nov 20 '15 at 00:35
  • These are all helpful comments. Since I figured out how many data frames I'll need, I elected to make a list of subsets at the end of the script with `lapply` as @thelatemail suggests. – BDA Nov 20 '15 at 17:39
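
The two fixes raised in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: `appendSubset`, its `cutoff` argument, `inputs`, and `subsets` are hypothetical names that do not appear in the question, and the seed is fixed purely so the run is reproducible. The first part returns the modified list so the caller can reassign it; the second builds every subset in one go with `lapply`.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility

# Option 1: pass the list in, return the modified copy, reassign at the call site.
# appendSubset and cutoff are hypothetical names for this sketch.
appendSubset <- function(lst, df, cutoff = 0.5) {
  # drop = FALSE keeps the result a data frame even with a single column
  lst[[length(lst) + 1]] <- df[df$myData < cutoff, , drop = FALSE]
  lst  # the last expression is the return value
}

myList <- list()
myList <- appendSubset(myList, data.frame(myData = runif(10)))
myList <- appendSubset(myList, data.frame(myData = rnorm(10)))
length(myList)  # 2

# Option 2: collect the inputs first, then subset them all at once.
inputs <- list(
  unif = data.frame(myData = runif(10)),
  norm = data.frame(myData = rnorm(10))
)
subsets <- lapply(inputs, function(df) df[df$myData < 0.5, , drop = FALSE])
names(subsets)  # "unif" "norm"
```

In the original `appendList`, the assignment `myList[[...]] <- df` modifies a copy of `myList` that is local to the function and discarded when it returns, which is why the global list stays empty. Note also that `mydf[mydf$myData < 0.5, ]` on a one-column data frame returns a plain vector unless `drop = FALSE` is given.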
