0

How can I drop data frames with fewer than 3 variables? I tried this:

`1001.AFG.1.A`<-data.frame(x = 1, y = 1:10)
apply(ls(), function(x) {if (dim(x)[2]<3) rm(x)})

The error message is:

Error in match.fun(FUN) : argument "FUN" is missing, with no default
Cœur
GPierre
  • Try `sapply` or `lapply`. `apply` is for matrices and expects a `MARGIN` argument (1 for row-wise, 2 for column-wise). Also, you'll have to add a `get` to test the `dim` of the actual object, not just its name, and you will very probably need to specify the environment – Cath Jan 28 '15 at 15:08
  • You may have to use `mget(ls())` and then, using `lapply`, check for each element of the list whether it is a data.frame with fewer than 3 variables. Get the names of those elements and use `rm(list=.., envir=.GlobalEnv)` – akrun Jan 28 '15 at 15:12
  • @GPierre A similar question was answered by RichardScriven recently. The only difference is the added dimension check. Please check this link http://stackoverflow.com/questions/28142088/how-to-exclude-only-the-data-frames-from-the-global-environment-in-r/28142128#28142128 – akrun Jan 28 '15 at 15:32
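
A minimal sketch of the `mget` route suggested in the comments above, not a definitive implementation. It runs against a scratch environment (`env`, an assumption made here so the demo does not touch the real workspace), and the object names `wide` and `v` are hypothetical. Substituting `.GlobalEnv` for `env` would apply it to the global workspace.

```r
# Scratch environment standing in for the global workspace (assumption for a safe demo)
env <- new.env()
assign("1001.AFG.1.A", data.frame(x = 1, y = 1:10), envir = env)    # 2 columns: drop
assign("wide", data.frame(a = 1:3, b = 4:6, c = 7:9), envir = env)  # 3 columns: keep
assign("v", 1:5, envir = env)                                       # not a data frame: keep

# Fetch every object as a named list, then flag data frames with fewer than 3 columns
objs <- mget(ls(env), envir = env)
is_narrow <- vapply(objs, function(obj) is.data.frame(obj) && ncol(obj) < 3, logical(1))

# Remove the flagged objects by name
drop <- names(objs)[is_narrow]
rm(list = drop, envir = env)
ls(env)
```

Because objects are looked up by name as strings, a non-syntactic name such as `1001.AFG.1.A` needs no special handling here.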

2 Answers

3

You may want to try:

sapply(ls(), function(x) {
  if (is.data.frame(get(x)) && dim(get(x))[2] < 3) rm(list = x, envir = .GlobalEnv)
})

If you want to suppress the printed output, you can wrap the call in `invisible`:

invisible(sapply(ls(), function(x) {
  if (is.data.frame(get(x)) && dim(get(x))[2] < 3) rm(list = x, envir = .GlobalEnv)
}))
Cath
  • Thank you both for your suggestions. With this piece of code, I get an `argument is of length zero` error. My data frame names contain dots, e.g. `1001.AFG.1.A`. I know it is not the best. Could this impact the solution? – GPierre Jan 28 '15 at 15:28
  • @GPierre: the problem was that the second part of the `if` condition was evaluated even for non-`data.frame` objects (which might not even have a `dim`...). I changed `&` to `&&`, which evaluates the second part of the condition only if the first part is `TRUE`. – Cath Jan 28 '15 at 15:37
  • @CathG The solution is not giving me the correct result whereas the one in the link removes the dataframes correctly – akrun Jan 28 '15 at 15:40
  • I created an object `dfN` with 2 columns. But even after running the code, the object remained – akrun Jan 28 '15 at 15:47
  • Your last edit worked. I already validated the answer of @G. Grothendieck because he found it first. But +1 for your solution as well. thanks. – GPierre Jan 28 '15 at 15:52
  • @akrun, weird thing: I tried my code with 4 elements (2 to delete, 2 to keep: a 4-column data frame and a vector) in an R session and it worked... then (the weird part), after your comment, I did the same in another R session and I got some extra output and the function didn't work! I changed `rm(x,...)` to `rm(list=x,...)` and it worked, but I still get the extra output... – Cath Jan 28 '15 at 15:52
  • @GPierre, thanks for feedback (and+1), glad my solution finally worked ;-) – Cath Jan 28 '15 at 15:53
  • @CathG I was trying `rm(x,.. ` instead of `rm(list=` I think the recent edit works – akrun Jan 28 '15 at 15:54
  • @CathG It prints the return value of `sapply`, i.e. `NULL`. You can wrap it with `invisible(` if you don't want that – akrun Jan 28 '15 at 15:59
  • @akrun, right, thanks! (Now I remember adding `invisible` to a `sapply` call to avoid extra printing, but it was very long ago...) I'll add the "option"! – Cath Jan 28 '15 at 16:01
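
The two fixes discussed in the comments above (`&` changed to `&&`, and `rm(x, ...)` changed to `rm(list = x, ...)`) can be illustrated with a small sketch. The scratch environment `env`, the object `df2`, and the helper names `drop_by_symbol` and `drop_by_name` are hypothetical, introduced only for this demonstration.

```r
# 1) `&` vs `&&` on an object that has no dimensions
v <- 1:5                                            # plain vector, so dim(v) is NULL
elementwise  <- is.data.frame(v) & dim(v)[2] < 3    # logical(0): `if` fails on this
shortcircuit <- is.data.frame(v) && dim(v)[2] < 3   # FALSE: right-hand side never evaluated

# 2) rm(x) vs rm(list = x) inside a function
env <- new.env()   # hypothetical scratch environment
assign("df2", data.frame(x = 1, y = 1:10), envir = env)

# rm(x) treats `x` as a literal object name, so it looks for an object called "x"
# and only warns when none exists (the warning is suppressed here)
drop_by_symbol <- function(x) suppressWarnings(rm(x, envir = env))
# rm(list = x) uses the character value stored in `x` as the name to remove
drop_by_name <- function(x) rm(list = x, envir = env)

drop_by_symbol("df2")
still_there <- exists("df2", envir = env, inherits = FALSE)   # df2 was not removed

drop_by_name("df2")
gone <- !exists("df2", envir = env, inherits = FALSE)         # now it is removed
```

Passing `logical(0)` to `if` is what produces the `argument is of length zero` error reported earlier in the thread.
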
3

1) The first line produces a named logical vector, to.rm, with one component per object: TRUE if that object should be removed and FALSE otherwise. Thus names(to.rm)[to.rm] are the names of the objects to be removed, so feed that into rm. Splitting it into two steps lets one review to.rm before actually performing the rm.

to.rm <- unlist(eapply(.GlobalEnv, function(x) is.data.frame(x) && ncol(x) < 3))
rm(list = names(to.rm)[to.rm], envir = .GlobalEnv)

If this is entered directly into the global environment (i.e. not placed in a function) then envir = .GlobalEnv in the last line is the default and can be omitted.
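
As a hedged illustration of the review step in (1), the sketch below runs the same two lines against a scratch environment instead of `.GlobalEnv`. The environment `env` and the object names `keep.df` and `keep.vec` are assumptions made for the demo. Since `eapply` visits objects in arbitrary order, the vector is sorted by name before printing.

```r
# Scratch environment with one removable data frame and two objects to keep
env <- new.env()
assign("1001.AFG.1.A", data.frame(x = 1, y = 1:10), envir = env)
assign("keep.df", data.frame(a = 1:2, b = 3:4, c = 5:6), envir = env)
assign("keep.vec", letters, envir = env)

# Step 1: build the named logical vector and review it
to.rm <- unlist(eapply(env, function(x) is.data.frame(x) && ncol(x) < 3))
print(to.rm[order(names(to.rm))])

# Step 2: remove only the objects flagged TRUE
rm(list = names(to.rm)[to.rm], envir = env)
remaining <- sort(ls(env))
```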

2) Another way is to iterate through the object names of env as shown. We have provided a verbose argument to show what it is doing and a dryrun argument to show what it would remove without actually removing anything.

rm2 <- function(env = .GlobalEnv, verbose = FALSE, dryrun = FALSE, all.names = FALSE) {
  for(nm in ls(env, all.names = all.names)) {
    obj <- get(nm, env)
    if (is.data.frame(obj) && ncol(obj) < 3) {
      if (verbose || dryrun) cat("removing", nm, "\n")
      if (!dryrun) rm(list = nm, envir = env)
    }
  }
}

rm2(dryrun = TRUE)  
rm2(verbose = TRUE)

Update Added envir argument to rm in (1). It was already in (2).

Update 2 Minor improvements to (2).

G. Grothendieck