I am trying to make an operation conditional on the name of a column in a data.table. The example below illustrates what I mean. We have a DT with two columns, carrot and banana, each containing values. I want the carrot values to be multiplied by 2 and the banana values to be divided by 2. My code, however, does not work, because names(.SD) is a vector of length 2 (the same as names(DT)) rather than the name of the column currently being processed. Is there a way I can make this work with lapply()?

library(data.table)

carrot <- 1:5
banana <- 1:5

DT <- data.table(carrot, banana)

DT[, lapply(.SD, function(x) if(names(.SD) == 'carrot') {x * 2} else {x / 2}), .SDcols = names(DT)]
koteletje

2 Answers

The question and answer "Access lapply index names inside FUN" gave me the inspiration for a solution:

DT[, lapply(seq_along(names(.SD)),
            function(y, n, i) if(n[[i]] == 'carrot') {y[[i]] * 2} else {y[[i]] / 2},
            y = .SD,
            n = names(.SD)),
   .SDcols = names(DT)]
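
A possible alternative, as a sketch only: Map() walks over the columns and their names in parallel, so the function sees one name at a time. It also keeps the original column names, whereas the lapply(seq_along(...)) version above returns columns named V1 and V2. The names res, x and nm are just illustrative.

```r
library(data.table)

DT <- data.table(carrot = 1:5, banana = 1:5)

# x is one column, nm is that column's name
res <- DT[, Map(function(x, nm) if (nm == "carrot") x * 2 else x / 2,
                .SD, names(.SD)),
          .SDcols = names(DT)]
res
```

Because .SD is the first argument after the function, Map() reuses its names for the result.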
koteletje

Do you have to do it in one operation? I think multiple operations are cleaner, e.g.:

library(data.table)

carrot <- 1:5
banana <- 1:5

DT <- data.table(carrot, banana)

# simplest way: assign back to the original columns (or to new columns)
DT[, carrot := carrot*2]
DT[, banana := banana/2]

# lapply way - do it twice
DT <- data.table(carrot, banana)
cols1 <- "carrot"
cols2 <- "banana"

# returns new tables; DT itself is not modified
DT[, lapply(.SD, function(x) x*2), .SDcols=cols1]
DT[, lapply(.SD, function(x) x/2), .SDcols=cols2]

# can also assign back into DT
DT[, (cols1) := lapply(.SD, function(x) x*2), .SDcols=cols1]
DT[]
DT[, (cols2) := lapply(.SD, function(x) x/2), .SDcols=cols2]
DT[]
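
If the goal is simply a single call, one option (a sketch, and it may not fit the larger problem where the column name has to be looked up) is the functional form of `:=`, which updates several columns in one statement:

```r
library(data.table)

DT <- data.table(carrot = 1:5, banana = 1:5)

# functional form of `:=`: both columns are updated by reference in one call
DT[, `:=`(carrot = carrot * 2, banana = banana / 2)]
DT[]
```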
Jonny Phelps
  • @Jonny Phelps, yes, preferably one operation. It is part of a bigger problem, and getting the name is the last thing I need to solve it. – koteletje Nov 27 '19 at 16:36