3

Consider this data.table:

 dt <- data.table(V1=c(3,2),V2=c(4,2),V3=c(6,5),V4=c(7,8),V5=c(9,10))
 > dt
    V1 V2 V3 V4 V5
 1:  3  4  6  7  9
 2:  2  2  5  8 10

I would like to create a new column that pastes the values and names of a subset of columns. The original columns should be retained.

The desired output is:

 > cols <- paste0("V",2:3)
 > dt
    V1 V2 V3 V4 V5    merged
 1:  3  4  6  7  9 V2:4 V3:6
 2:  2  2  5  8 10 V2:2 V3:5

I was able to get the output by sending .SD to a function and going through each row:

getSub <- function(SD){
            lapply(1:nrow(SD), function(s) paste0(cols,":", SD[s], collapse=" ")) 
          }
dt[, merged := list(getSub(.SD)), .SDcols=cols]
> dt
   V1 V2 V3 V4 V5              merged
1:  3  4  6  7  9 V1:3 V2:4 V3:6 V4:7
2:  2  2  5  8 10 V1:2 V2:2 V3:5 V4:8

I was wondering if there is a way to use something like dt[, merged := list(do.call(paste,c(c(cols,.SD),sep=" "))), .SDcols=cols] since it appears to be much faster https://stackoverflow.com/a/23819177/2070021?

Community
  • 1
  • 1

1 Answers1

3

Try this:

dt[, merged := do.call(paste, Map(function(x, y) paste(x, y, sep = ':'),
                                  names(.SD), .SD)),
     .SDcols = cols]

Another option is constructing the expression and evaluating it, but the above seems fast enough not to bother with that mess.

eddi
  • 49,088
  • 6
  • 104
  • 155