For example:

library(data.table)
dt <- data.table()
x <- 1:5
> dt[,list(2,3,x)]
   V1 V2 x
1:  2  3 1
2:  2  3 2
3:  2  3 3
4:  2  3 4
5:  2  3 5

The resulting data.table has a column named x, taken from the variable name, while the unnamed constants become V1 and V2.

I would like to wrap this in a function to simplify data.table construction:

tt <- function(a, b, ...){
    list(a=sum(a), b=sum(b), ...)
}

> dt[,tt(1:2,1:3,x)]
   a b V3
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5

The idea is to call tt wherever I would otherwise call list, so that it inserts the predefined columns for me automatically. However, the shortcut naming no longer works: x ends up as V3 instead of x.

Is there a reasonably simple way to improve tt so that it names columns automatically, the way list does inside data.table?
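
A likely explanation, sketched in base R: the name seems to be lost because, by the time list(...) runs inside tt, the extra argument arrives as an unnamed value, and j is a call to tt() rather than a direct call to list(). The snippet below only checks the first part (the empty name) and makes no claim about data.table internals.

```r
# Same tt as in the question, called directly (no data.table involved).
tt <- function(a, b, ...) {
    list(a = sum(a), b = sum(b), ...)
}

x <- 1:5
res <- tt(1:2, 1:3, x)

# The third element has an empty name; nothing records that it came from `x`.
names(res)
# [1] "a" "b" ""
```

An empty name like this is what ends up displayed as V3 in the output above.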

Aim

dt[,tt(1:2,1:3,x)]

Returns

   a b  x
1: 3 6  1
2: 3 6  2
3: 3 6  3
4: 3 6  4
5: 3 6  5

Solution

tt <- function(a, b, ...){
    dots <- list(...)
    inferred <- sapply(substitute(list(...)), function(x) deparse(x)[1])[-1]
    if(is.null(names(inferred))){
        names(dots) <- inferred
    } else {
        names(dots)[names(inferred) == ""] <- inferred[names(inferred) == ""]
    }
    c(a=sum(a), b=sum(b), dots)
}

dt <- data.table(c=1:5)
x=1:5

> dt[,tt(1:2,1:3,x,c+1)]
   a b x c + 1
1: 3 6 1     2
2: 3 6 2     3
3: 3 6 3     4
4: 3 6 4     5
5: 3 6 5     6
> dt[,tt(1:2,1:3,x, z=c+1)]
   a b x z
1: 3 6 1 2
2: 3 6 2 3
3: 3 6 3 4
4: 3 6 4 5
5: 3 6 5 6
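
The mechanism the solution relies on can be seen in isolation. This is a minimal base R sketch; show_dots is a hypothetical helper name used only for illustration. substitute(list(...)) returns the unevaluated call, so each argument can be deparsed into text, and any names supplied by the caller are kept.

```r
# Hypothetical helper: return the text of each argument passed via `...`.
show_dots <- function(...) {
    # Unevaluated arguments; [-1] drops the leading `list` symbol.
    exprs <- as.list(substitute(list(...)))[-1]
    # deparse(e)[1] keeps only the first line of long expressions.
    vapply(exprs, function(e) deparse(e)[1], character(1))
}

# The arguments are never evaluated, so x and c need not exist here.
out <- show_dots(x, c + 1, z = c + 1)

unname(out)
# [1] "x"     "c + 1" "c + 1"
names(out)
# [1] ""  ""  "z"
```

Entries with an empty name are the ones that need an inferred name; the entry named "z" already has one and should be left alone.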

Update

I recently found a bug in the code on page 46 of S Programming by Venables & Ripley. I have made some modifications and am posting the result here in the hope that it is useful.

# Build the best available vector of names for the arguments, as data.frame does.
# Modified from page 46 of S Programming by Venables & Ripley.
# http://stackoverflow.com/questions/20545476/how-does-data-table-get-the-column-name-from-j
name.args <- function(...){
    # Get a list of arguments.
    dots <- as.list(substitute(list(...)))[-1]
    # Get the names of the arguments; unnamed ones come back as "".
    # If no argument is named at all, names() returns NULL.
    nm <- names(dots)
    # If all arguments are named, return the names directly.
    # Otherwise the assignment nm[fixup] <- dep below would be given
    # an empty selection and an empty list.
    if (!is.null(nm) && all(nm != ""))
        return(nm)
    # Handle empty argument list case.
    if (length(dots) == 0)
        return(character(0))
    # Get positions of arguments without names.
    fixup <- 
        if (is.null(nm))
            seq_along(dots)
        else
            nm == ""
    dep <- sapply(dots[fixup], function(x) deparse(x)[1])
    if (is.null(nm))
        dep
    else {
        nm[fixup] <- dep
        nm
    }
}

# Example
# x <- 1:2
# name.args(x, y=3, 5:6)
# name.args(x=x, y=3)
# name.args()
colinfang

2 Answers

A simple solution would be to pass in additional arguments as named rather than unnamed arguments:

dt[,tt(1:2,1:3,x=x)]   ## Note that this uses `x=x` rather than just `x`
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5

Or for the truly lazy, something like this ;)

tt <- function(a, b, ...){
    dots <- list(...)
    names(dots) <- as.character(substitute(list(...))[-1])
    c(a=sum(a), b=sum(b), dots)
}
dt[,tt(1:2,1:3,x)]
#    a b x
# 1: 3 6 1
# 2: 3 6 2
# 3: 3 6 3
# 4: 3 6 4
# 5: 3 6 5
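
One caveat of the lazy version, shown as a small base R sketch (tt_lazy is just a renamed copy of the function above, and the call is made directly rather than inside a data.table): every name is taken from the expression text, so a name supplied explicitly by the caller is overwritten.

```r
# Renamed copy of the lazy tt, for illustration only.
tt_lazy <- function(a, b, ...) {
    dots <- list(...)
    # All names come from the argument expressions, named or not.
    names(dots) <- as.character(substitute(list(...))[-1])
    c(a = sum(a), b = sum(b), dots)
}

x <- 1:5
res <- tt_lazy(1:2, 1:3, x, z = x + 1)

# The caller asked for "z", but the expression text wins.
names(res)
# [1] "a"     "b"     "x"     "x + 1"
```

Supporting a mix of named and unnamed arguments therefore needs an extra step that fills in only the empty names.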
Josh O'Brien
  • of course I know I can always make it explicit... I am looking for a lazy solution. – colinfang Dec 12 '13 at 15:21
  • OK -- that wasn't as clear from your pre-edit question. Added a solution that will work for unnamed arguments. (If you want to use a combination of named and unnamed additional arguments, you'll likely want to do a bit of additional processing to merge the supplied and implied names.) – Josh O'Brien Dec 12 '13 at 15:35
  • I prefer doing `as.character(substitute(list(...))[-1])` (note that the "list" part is irrelevant, anything will do there as long as it looks like a function) instead of that `sapply(match.call(...` since I've had some issues with the latter – eddi Dec 12 '13 at 16:38
  • @eddi -- Yes, thanks. Since your suggestion is cleaner, I'll incorporate it with an edit. – Josh O'Brien Dec 12 '13 at 18:57
  • @eddi -- I see that on page 46 of their MASS book (where they present a souped up version of the above, which handles named and unnamed arguments) Venables & Ripley use `sapply(dots, function(x) deparse(x)[1])`. I bet that `[1]` would avoid the problems that can otherwise arise from using `sapply()`... – Josh O'Brien Dec 12 '13 at 19:20
  • great. Any chance it can still support the named arguments? – colinfang Dec 13 '13 at 00:38
  • @colinfang -- Glad to hear. For the record, the solution I was thinking of was on page 46 of S Programming (not Modern Applied Statistics with S), also by Venables & Ripley. Not sure whether it's OK to post that, though, so I won't ;) – Josh O'Brien Dec 13 '13 at 02:55
  • have no access to the book, and google book preview is unavailable for page 46...(page 45,48 are both available). BUT, I managed to find this: http://www.codecollector.net/view/96FEA01C-0295-4147-B458-E499A8F86795-95744-00003EE46A956454#.Uqr4DfRdXqQ – colinfang Dec 13 '13 at 12:07
  • @colinfang -- That's the one. Google really is amazing. – Josh O'Brien Dec 13 '13 at 14:19
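
One plausible reason for the `[1]` discussed in the comments above, as a base R sketch: deparse() splits long expressions across several strings, so without `[1]` sapply() may not be able to simplify its result to a character vector. The expression below is made up purely to be long enough to wrap.

```r
# An artificial expression that is well past deparse()'s default line width.
long_expr <- quote(some_function(argument_one = 1, argument_two = 2,
                                 argument_three = 3, argument_four = 4,
                                 argument_five = 5, argument_six = 6))
exprs <- list(quote(x), long_expr)

# deparse() returns more than one string for the long expression ...
n_pieces <- length(deparse(long_expr))

# ... so the lengths differ and sapply() falls back to returning a list.
without_first <- sapply(exprs, deparse)

# Taking [1] gives exactly one string per argument, hence a character vector.
with_first <- sapply(exprs, function(e) deparse(e)[1])
```

Here without_first is a list, while with_first is a character vector of length 2 that can be used directly as names.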

A simpler solution relies on tibble::lst:

library(data.table)

tt <- function(a, b, ...){
  tibble::lst(a=sum(a), b=sum(b), ...)
}

dt <- data.table(c=1:5)
x=1:5

dt[, tt(1:2, 1:3, x, c+1)]
#>    a b x c + 1
#> 1: 3 6 1     2
#> 2: 3 6 2     3
#> 3: 3 6 3     4
#> 4: 3 6 4     5
#> 5: 3 6 5     6
dt[, tt(1:2, 1:3, x, z=c+1)]
#>    a b x z
#> 1: 3 6 1 2
#> 2: 3 6 2 3
#> 3: 3 6 3 4
#> 4: 3 6 4 5
#> 5: 3 6 5 6
jan-glx