
My global environment contains several dataframes. I want to execute functions on only those that contain a specific string in their name. So, I first create a list of these dataframes of interest:

dfs <- ls()[sapply(ls(), function(x) class(get(x))) == 'data.frame']
dfs <- as.data.frame(dfs)
dfs_lst <- agrep("stats", dfs$dfs, ignore.case=FALSE, value=TRUE, 
    max.distance=0.1, useBytes=FALSE)

dfs_lst correctly holds the names of all dataframes in my global environment that contain the string "stats". str(dfs_lst) gives:

chr [1:3] "stats1" "stats2" "stats3"

Now I want to execute functions on these 3 dataframes, but I do not know how to retrieve them from the names stored in dfs_lst. I want something like:

for(i in 1:length(dfs_lst)){
   # Find dataframe name in dfs_lst, and then use the matching dataframe in
   # global environment. So, something of the sort:
   for(dfs_lst[i] in ls()){
        result[i,] <- dfs_lst[i] %>% 
                                 summarise(. , <summarise stuff> )
   }
}

For example, for i=1, dfs_lst[1] is the name "stats1", so I would want to run the following on the dataframe stats1 and save the output in the first row of "result":

   for(stats1 in ls()){
        result[1,] <- stats1 %>% summarise(. , <summarise stuff> )
   }
oguz ismail
  • 10
    You should store such data.frames in a list. Then you wouldn't have such problems. See the following [post](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207) for examples and details. It seems weird at first, but makes many operations much easier. – lmo May 20 '16 at 15:40
  • 4
    If you already have the names of your target objects, you can just do something like `lapply(mget(dfs[,1], envir = .GlobalEnv), function(x) { ... })`. – nrussell May 20 '16 at 15:43
  • 2
    In a pinch there's always `substitute(x, list(x = dfs_lst[1]))` or `eval(parse(text = dfs_lst[1]))`, but putting them in a list is a better option before you resort to such shenanigans. [Some sort-of related reading.](http://adv-r.had.co.nz/Computing-on-the-language.html) – alistaire May 20 '16 at 15:54

2 Answers


As @lmo pointed out, it's probably best to store these data.frames together in a single list. Instead of having data.frame objects called "stats1", "stats2", etc, floating around in your environment, a (hacky) way to store all your data.frame objects in a list is this:

dfs <- ls()[sapply(ls(), function(x) class(get(x))) == 'data.frame']

##make an empty list
my_list <- list()
##populate the list
for (dfm_name in dfs) {
   my_list[[dfm_name]] <- get(dfm_name)
}

Now you've got a list my_list containing every object of class data.frame in your environment. This will probably be helpful when you want to work with all data.frames named "statsX":

##find all list objects whose name starts with "stats"
stats_objects <- substr(names(my_list),1,5)=="stats"
results <- matrix(NA, ncol = your_length, nrow = sum(stats_objects))
##now perform intended operations
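for (row_num in seq_len(nrow(results))) {
  ##summarise() returns a one-row data.frame, so flatten it before storing it in the matrix row
  results[row_num,] <- my_list[stats_objects][[row_num]] %>% 
                             summarise(. , <summarise stuff> ) %>% 
                             unlist()
}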

This should perform as necessary after a couple of alterations to the code (e.g. your_length needs to be specified, and since you wanted all objects whose name contains "stats" rather than starts with it, you'll need to work with regular expressions).
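As a rough sketch of those alterations, the name matching and the collection of results could look like the following. It assumes three toy data frames (stats1, stats2 and other_df are made-up names with made-up values) and uses base R summaries in place of the dplyr summarise() call, so that it runs without extra packages:

```r
# Toy data frames standing in for the real ones (hypothetical names and values)
my_list <- list(
  stats1   = data.frame(x = c(1, 2, 3)),
  stats2   = data.frame(x = c(4, 5, 6)),
  other_df = data.frame(x = c(7, 8, 9))
)

# Regular expression match: TRUE for every name that contains "stats"
stats_objects <- grepl("stats", names(my_list))

# Summarise each matching data frame into a one-row data frame
summaries <- lapply(my_list[stats_objects], function(df) {
  data.frame(n = nrow(df), mean_x = mean(df$x))
})

# Stack the one-row summaries into a single data frame of results
results <- do.call(rbind, summaries)
print(results)
```

Collecting the rows with lapply() and do.call(rbind, ...) avoids having to know your_length in advance, because the number of columns follows from whatever the summary function returns.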

What's nice about this is my_list contains all the data.frames, so if you choose to run analysis on data.frames not named "stats" you can still access them with a similar procedure. Hope this helps.

BarkleyBG
  • This can be done a bit more cleanly with `apropos` and `mget`; something like `lapply(Filter(is.data.frame, mget(apropos("stats", mode = "list"), .GlobalEnv)), function (x) { ... })`. – nrussell May 20 '16 at 16:31

As discussed in the comments, if we have a list of the data frames of interest, it is easier to work with the elements as data frames. So, the main issue here seems to be having just the object names and not the actual data.frame objects.

To follow the code and track the data types, I first decomposed it:

1.

    env.list <- ls() # chr vector

2.

    env.classes <- sapply(env.list, function(x) class(get(x))) 
    # named chr vector (or a list, if some object has several classes); names are the object names

3.

   dfs <- env.list[env.classes == 'data.frame'] # chr vector

4.

    dfs <- as.data.frame(dfs) 
    # data frame with one column (named "dfs"), containing data.frame names

Now, we can get the list of data.frames:

3.

   dfs <- env.list[env.classes == 'data.frame'] # chr vector
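   # simplify = FALSE keeps a named list; otherwise sapply may collapse the data frames into a matrix
   dfs.list <- sapply(dfs, function(x) {get(x)}, simplify = FALSE)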

grep can be applied now to names(dfs.list) to get the interesting data frames.
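A minimal sketch of that last step follows. It assumes a scratch environment (demo_env, a hypothetical name) filled with toy objects, so the example does not depend on what happens to be in the global environment, and it uses mget() with Filter() as one possible way to build the list:

```r
# Scratch environment with toy objects (all names and values are made up)
demo_env <- new.env()
assign("stats1", data.frame(x = 1:3), envir = demo_env)
assign("stats2", data.frame(x = 4:6), envir = demo_env)
assign("sales", data.frame(x = 7:9), envir = demo_env)
assign("stats_note", "not a data frame", envir = demo_env)

# Fetch every object by name, then keep only the data frames
all.objects <- mget(ls(demo_env), envir = demo_env)
dfs.list <- Filter(is.data.frame, all.objects)

# grep on the list names gives the positions of the matching data frames
stats.idx <- grep("stats", names(dfs.list))
stats.dfs <- dfs.list[stats.idx]

print(names(stats.dfs))
```

For objects that live in the global environment, the same calls should work with .GlobalEnv in place of demo_env.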

pbahr