Conditionally remove data frames from environment

Question

How can I drop data frames with fewer than 3 variables? I tried this:

`1001.AFG.1.A`<-data.frame(x = 1, y = 1:10)
apply(ls(), function(x) {if (dim(x)[2]<3) rm(x)})

The error message is:

Error in match.fun(FUN) : argument "FUN" is missing, with no default

Try `sapply` or `lapply`. `apply` is for a `matrix` and expects a `MARGIN` argument (1 for row-wise, 2 for column-wise). Also, you'll have to add a `get` to test the `dim` of the actual object, not just its name, and you will very probably need to specify the environment – Cath, Jan 28 '15 at 15:08
You may have to use `mget(ls())` and then use `lapply` on the resulting list with a condition that checks whether each element is a data.frame and tests its number of variables. Get the names, and use `rm(list=.., envir=.GlobalEnv)` – akrun, Jan 28 '15 at 15:12
@GPierre A similar question was answered by RichardScriven recently. The only difference is the added dimension check. Please check this link http://stackoverflow.com/questions/28142088/how-to-exclude-only-the-data-frames-from-the-global-environment-in-r/28142128#28142128 – akrun, Jan 28 '15 at 15:32
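
The `mget` approach suggested in the comments could look roughly like the sketch below. It is only an illustration: it works in a scratch environment (the hypothetical `env`) rather than `.GlobalEnv` so that nothing in a real workspace is touched, and the object names `wide` and `v` are made up for the example.

```r
# Scratch environment standing in for .GlobalEnv (an assumption for safety).
env <- new.env()
assign("1001.AFG.1.A", data.frame(x = 1, y = 1:10), envir = env)  # 2 columns
assign("wide", data.frame(a = 1, b = 2, c = 3), envir = env)      # 3 columns
assign("v", 1:5, envir = env)                                     # not a data frame

# Fetch every object into a named list, then flag the small data frames.
objs  <- mget(ls(env), envir = env)
small <- vapply(objs, function(x) is.data.frame(x) && ncol(x) < 3, logical(1))

# Remove the flagged objects by name.
rm(list = names(small)[small], envir = env)

ls(env)  # only "v" and "wide" remain
```

Because the names are passed as strings through `list =`, unusual names such as `1001.AFG.1.A` need no special quoting.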

Cath · Answer 1 · 2015-01-28T16:02:48.910

You may want to try:

sapply(ls(), function(x) {
  if (is.data.frame(get(x)) && dim(get(x))[2] < 3) rm(list = x, envir = .GlobalEnv)
})

If you want to suppress the printed output, you can wrap the call in `invisible`:

invisible(sapply(ls(), function(x) {
  if (is.data.frame(get(x)) && dim(get(x))[2] < 3) rm(list = x, envir = .GlobalEnv)
}))

edited Jan 28 '15 at 16:02

answered Jan 28 '15 at 15:17

Thank you both for your suggestions. With this piece of code, I get an `argument is of length zero` error. My data frame names contain dots, e.g. `1001.AFG.1.A`. I know it is not ideal. Could this affect the solution? – GPierre Jan 28 '15 at 15:28
@GPierre: the problem was that the second part of the `if` condition was evaluated even for non-`data.frame` objects (which might not even have a `dim`). I changed `&` to `&&`, which evaluates the second part of the condition only if the first part is `TRUE`. – Cath Jan 28 '15 at 15:37
@CathG The solution is not giving me the correct result whereas the one in the link removes the dataframes correctly – akrun Jan 28 '15 at 15:40
I created an object `dfN` with 2 columns. But even after running the code, the object remained – akrun Jan 28 '15 at 15:47
Your last edit worked. I already validated the answer of @G. Grothendieck because he found it first. But +1 for your solution as well. thanks. – GPierre Jan 28 '15 at 15:52
@akrun, weird thing: I tried my code with 4 elements (2 to delete, 2 to keep: a 4-column data frame and a vector) in an R session and it worked. Then (the weird part), after your comment, I did the same in another R session and I got some extra output and the function didn't work! I changed `rm(x,...)` to `rm(list=x,...)` and it worked, but I still get the extra output. – Cath Jan 28 '15 at 15:52
@GPierre, thanks for feedback (and+1), glad my solution finally worked ;-) – Cath Jan 28 '15 at 15:53
@CathG I was trying `rm(x,.. ` instead of `rm(list=` I think the recent edit works – akrun Jan 28 '15 at 15:54
@CathG It prints the return value of `sapply`, i.e. `NULL`. You can wrap it in `invisible(` if you don't want that – akrun Jan 28 '15 at 15:59
@akrun, right, thanks! (Now I remember adding `invisible` to a `sapply` call to avoid extra printing, but it was very long ago.) I'll add the "option"! – Cath Jan 28 '15 at 16:01
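
The `argument is of length zero` error discussed in the comments can be reproduced in isolation. The sketch below is only an illustration, with a plain vector `v` (a made-up name) standing in for any non-data-frame object in the workspace.

```r
v <- 1:5                     # a plain vector has no dim attribute

# dim(v) is NULL, so the comparison yields a zero-length logical vector.
second <- dim(v)[2] < 3
length(second)               # 0

# `&` evaluates both sides, so the result also has length zero,
# which is what makes `if (...)` fail with "argument is of length zero".
length(is.data.frame(v) & second)    # 0

# `&&` stops at the first FALSE and never evaluates the dim() test.
is.data.frame(v) && dim(v)[2] < 3    # FALSE
```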

G. Grothendieck · Accepted Answer · 2015-01-28T15:57:56.897

1) The first line produces a named logical vector, to.rm, with one component per object, which is TRUE if that object should be removed and FALSE otherwise. Thus names(to.rm)[to.rm] are the names of the objects to be removed, so feed that into rm. Splitting this into two steps lets one review to.rm before actually performing the rm.

to.rm <- unlist(eapply(.GlobalEnv, function(x) is.data.frame(x) && ncol(x) < 3))
rm(list = names(to.rm)[to.rm], envir = .GlobalEnv)

If this is entered directly into the global environment (i.e. not placed in a function) then envir = .GlobalEnv in the last line is the default and can be omitted.
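
As a rough illustration of why the envir argument matters once the code lives inside a function, the sketch below uses a scratch environment (`env`) and two helper functions (`drop_without_envir`, `drop_with_envir`). All three names are hypothetical and exist only for this example.

```r
# Scratch environment holding one small data frame.
env <- new.env()
assign("small", data.frame(x = 1, y = 1:10), envir = env)

# Without envir, rm() looks in the function's own frame and finds nothing.
drop_without_envir <- function() rm(list = "small")

# With envir, rm() is told exactly where the object lives.
drop_with_envir <- function() rm(list = "small", envir = env)

suppressWarnings(drop_without_envir())           # would warn: object 'small' not found
exists("small", envir = env, inherits = FALSE)   # TRUE, nothing was removed

drop_with_envir()
exists("small", envir = env, inherits = FALSE)   # FALSE
```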

2) Another way is to iterate through the object names of env as shown. We have provided a verbose argument to show what it is doing and a dryrun argument to show what it would remove without actually removing anything.

rm2 <- function(env = .GlobalEnv, verbose = FALSE, dryrun = FALSE, all.names = FALSE) {
  for(nm in ls(env, all.names = all.names)) {
    obj <- get(nm, env)
    if (is.data.frame(obj) && ncol(obj) < 3) {
      if (verbose || dryrun) cat("removing", nm, "\n")
      if (!dryrun) rm(list = nm, envir = env)
    }
  }
}

rm2(dryrun = TRUE)  
rm2(verbose = TRUE)

Update: Added envir argument to rm in (1). It was already in (2).

Update 2: Minor improvements to (2).
