
If I have a data table, foo, in R with a column named "date", I can get the vector of date values by the notation

foo[, date]

(Unlike with data frames, date doesn't need to be in quotes.)

How can this be done programmatically? That is, if I have a variable x whose value is the string "date", then how do I access the column of foo with that name?

Something that sort of works is to create a symbol:

sym <- as.name(x)
v <- foo[, eval(sym)]

...

As I say, that sort of works, but there is something not quite right about it. If that code is inside a function myFun in package myPackage, then it seems that it doesn't work if I explicitly use the package through:

myPackage::myFun(...)

I get an error message saying "undefined columns selected".

[edited] Some more details

Suppose I create a package called myPackage. This package has a single file with the following in it:

library(data.table)
#' export
myFun <- function(table1) {
    names1 <- names(table1)
    name1 <- names1[[1]]
    sym <- as.Name(name1)
    table1[, eval(sym)]
}

If I load that function using R Studio, then

myFun(tbl)

returns the first column of the data table tbl.

On the other hand, if I call

myPackage::myFun(tbl)

it doesn't work. It complains about

Error in .subset(x, j) : invalid subscript type 'builtin'

I'm just curious as to why myPackage:: would make this difference.

Daryl McCullough
  • Try `x <- "date"; foo[, x, with = FALSE]`, or just `foo[[x]]` – David Arenburg Sep 10 '14 at 23:01
  • Thank you! Just because I'm curious, do you have any idea why my foo[, eval(sym)] would work in some cases, but not others? It seems that I get different behavior if I call myFun(...) versus myPackage::myFun. I'm guessing that the :: screws up the namespace for symbols? – Daryl McCullough Sep 10 '14 at 23:08
  • That I can't help you with, sorry. I can bring this question to the attention of one of the `data.table` authors if you want, but I doubt he would be able to help with so little information. Either way, there is no reason to use either `as.name` or `eval`. You can evaluate names within `data.table` just by putting them into `()` – David Arenburg Sep 10 '14 at 23:10
  • Without knowing what else is done in "myFun" and "myPackage" (in terms of creating a data.table aware package), the second part of the question is unanswerable. – mnel Sep 10 '14 at 23:14
  • I'm not used to StackOverflow. It would be enormously helpful if there was a preview function. – Daryl McCullough Sep 10 '14 at 23:47
  • ARRG! How do I create a code block????? The "Learn more.." is missing some very basic information, such as an EXAMPLE. – Daryl McCullough Sep 10 '14 at 23:55
  • I've tried every possible combination of indenting and [code] and and 'code' and ... Nothing comes out as code. – Daryl McCullough Sep 11 '14 at 00:04
  • You are not meant to place a lot of code in comments. Only in-line code is allowed in comments (which is quoted by back-ticks). Chunks should be edited into the question if appropriate. But your example shouldn't work at all. You must be relying on the fact that you have another variable in your environment that has a `name` attribute. So the irregular behavior is likely caused by duplicate variable names (which normally isn't a problem when you do things the "right way", which `as.name` and `eval` is not). – MrFlick Sep 11 '14 at 00:46
  • Does your package correctly import / depend on the data.table package? Have a read of http://stackoverflow.com/questions/10527072/using-data-table-package-inside-my-own-package – mnel Sep 11 '14 at 02:25

2 Answers


A quick way which points to a longer way is this:

subset(foo, TRUE, date)

The subset function accepts unquoted symbols/names for its 'subset' and 'select' arguments. (Its author, however, thinks this was a bad idea and suggests we use formulas instead.) This was the jumping-off point for sections of Hadley Wickham's Advanced Programming webpages (and book): http://adv-r.had.co.nz/Computing-on-the-language.html and http://adv-r.had.co.nz/Functional-programming.html . You can also look at the code for subset.data.frame:

> subset.data.frame
function (x, subset, select, drop = FALSE, ...) 
{
    r <- if (missing(subset)) 
        rep_len(TRUE, nrow(x))
    else {
        e <- substitute(subset)
        r <- eval(e, x, parent.frame())
        if (!is.logical(r)) 
            stop("'subset' must be logical")
        r & !is.na(r)
    }
    vars <- if (missing(select)) 
        TRUE
    else {
        nl <- as.list(seq_along(x))
        names(nl) <- names(x)
        eval(substitute(select), nl, parent.frame())
    }
    x[r, vars, drop = drop]
}

The problem with "naked" expressions that get passed into functions is that the frame in which they are evaluated is sometimes not the one you expect. R formulas, like functions, carry a pointer to the environment in which they were defined.
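A minimal base R sketch of that last point, assuming a made-up helper make_formula and a toy data frame df (neither is from the question): the formula keeps a reference to the environment where it was created, so its right-hand side can still find threshold after the function has returned.

```r
# make_formula is a hypothetical helper; threshold exists only inside it.
make_formula <- function() {
  threshold <- 5
  ~ value > threshold
}

f <- make_formula()

# The formula carries the environment in which it was defined.
env <- environment(f)
found <- get("threshold", envir = env)

# Evaluate the right-hand side: look in df's columns first,
# then fall back to the formula's own environment for 'threshold'.
df <- data.frame(value = c(3, 7, 9))
keep <- eval(f[[2]], df, environment(f))
df[keep, , drop = FALSE]
```

A bare expression captured with substitute() has no such pointer, which is why subset.data.frame has to guess with parent.frame().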

IRTFM

I think the problem is that you've defined myFun in your global environment, so it only appeared to work.

I changed as.Name to as.name, and created a package with the following functions:

library(data.table)
myFun <- function(table1) {
    names1 <- names(table1)
    name1 <- names1[[1]]
    sym <- as.name(name1)
    table1[, eval(sym)]
}
myFun_mod <- function(dt) {
    # dt[, eval(as.name(colnames(dt)[1]))]
    dt[[colnames(dt)[1]]]
}

Then, I tested it using this:

library(data.table)
myDt <- data.table(a=letters[1:3],b=1:3)
myFun(myDt)
myFun_mod(myDt)

myFun didn't work; myFun_mod did.

The output:

> library(test)
> myFun(myDt)
Error in eval(expr, envir, enclos) : object 'a' not found
> myFun_mod(myDt)
[1] "a" "b" "c"

Then I added the following line to the NAMESPACE file:

import(data.table)

This is what @mnel was talking about with this link: Using data.table package inside my own package

After adding import(data.table), both functions work.
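As far as I understand it, the reason is that data.table checks whether the calling package's namespace imports (or depends on) data.table. If it doesn't, `[` falls back to data.frame behaviour, so eval(sym) is evaluated in the function's own frame, where there is no object called a. If the package is built with roxygen2 (an assumption on my part, based on the #' comment in the question), the same NAMESPACE line can be generated from a tag instead of edited by hand. The file name below is hypothetical:

```r
# R/myPackage-package.R  (hypothetical file name)
# roxygen2 turns this tag into `import(data.table)` in NAMESPACE.
#' @import data.table
NULL
```

data.table should also be listed under Imports: in the package's DESCRIPTION file; a library(data.table) call at the top of a package source file is not a substitute for that.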

I'm still not sure why you got the particular .subset error, which is why I went through the effort of reproducing the result...
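For the original question (selecting a column whose name is stored in a variable), the approaches suggested in the comments can be sketched as follows. The table foo and its contents are made up for illustration, and the snippet assumes data.table is installed:

```r
library(data.table)

# Toy table for illustration only; not the asker's data.
foo <- data.table(date = as.Date("2014-09-10") + 0:2,
                  value = c(10, 20, 30))
x <- "date"

v1 <- foo[[x]]                # list-style extraction, returns the vector
v2 <- foo[, get(x)]           # look the column up by name inside j
v3 <- foo[, x, with = FALSE]  # returns a one-column data.table, not a vector
```

foo[[x]] is the simplest of the three, and because it is plain list extraction it does not rely on data.table's special handling of j.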

geneorama