I need to create a new variable in a data.table whose value is taken from one of several columns, depending on the value of another column. An example:

library(data.table)

set.seed(pi)
DT <- data.table(
  X1 = LETTERS[1:10],
  X2 = letters[1:10],
  Z = sample(c("X1", "X2"), 10, replace = TRUE)
)

DT[]

This code generates the following:

    X1 X2  Z
 1:  A  a X1
 2:  B  b X2
 3:  C  c X1
 4:  D  d X1
 5:  E  e X2
 6:  F  f X2
 7:  G  g X1
 8:  H  h X1
 9:  I  i X2
10:  J  j X2

Now I want to add a column W that, for each row, takes its value from column X1 when Z is "X1" and from column X2 when Z is "X2".

One solution is:

DT[Z == "X1", W := X1]
DT[Z == "X2", W := X2]

But I would like to find a more elegant way to do this, because in my real data there are many columns to select from.

Thanks

Frank
Enrique Pérez Herrero

1 Answer

We can use get, grouping by the sequence of rows so that it is evaluated one row at a time:

DT[, W := get(Z), by = seq_len(nrow(DT))]

Or with eval(as.name(Z)):

DT[, W := eval(as.name(Z)), by = seq_len(nrow(DT))]
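
As a hedged variation on the same idea (not part of the original answer), grouping by Z itself rather than by row should give the same result with one `get` call per distinct column name instead of one per row. The sketch below writes Z out explicitly, copying the values from the printed table in the question, so it does not depend on the random seed:

```r
library(data.table)

# Same layout as the question, with Z written out explicitly
DT <- data.table(
  X1 = LETTERS[1:10],
  X2 = letters[1:10],
  Z  = c("X1", "X2", "X1", "X1", "X2", "X2", "X1", "X1", "X2", "X2")
)

# Within each group Z is a single string, so get(Z) returns
# that column's values for the rows of the group
DT[, W := get(Z), by = Z]

# Cross-check against an explicit two-column ifelse()
stopifnot(identical(DT$W, ifelse(DT$Z == "X1", DT$X1, DT$X2)))
```

This assumes every value in Z is the name of an existing column, and that the candidate columns share a type, since W is created from the first group and later groups are assigned into it.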
akrun
  • Thank you. The answer looks easy now, but I was stuck on expressions like: `DT[, W := diag(as.matrix(DT[,DT[, Z], with = FALSE]))]` – Enrique Pérez Herrero Sep 14 '16 at 18:39
  • @EnriquePérezHerrero Creating a matrix and taking `diag` is unlikely to be efficient – akrun Sep 14 '16 at 18:42
  • Yes, you lose the advantages of using a `data.table` – Enrique Pérez Herrero Sep 14 '16 at 18:43
  • @EnriquePérezHerrero Yes, that is one problem. The second is that creating a matrix from a huge dataset can take a lot of memory, and `diag` is likely to be slow – akrun Sep 14 '16 at 18:44
  • @EnriquePérezHerrero If you are using a `data.frame`, this can be made more efficient with row/column indexing, i.e. `setDF(DT); DT$W <- DT[cbind(seq_len(nrow(DT)), match(DT$Z, names(DT)))]` – akrun Sep 14 '16 at 18:48
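
The row/column indexing idea from the last comment can be sketched in base R as follows. This is an illustrative version under stated assumptions rather than the commenter's exact code: `DF` and `idx` are names introduced here, the Z values are copied from the printed table in the question, and the column position is found by matching each value of Z against the column names.

```r
# Plain data.frame with the same contents as the question's table
DF <- data.frame(
  X1 = LETTERS[1:10],
  X2 = letters[1:10],
  Z  = c("X1", "X2", "X1", "X1", "X2", "X2", "X1", "X1", "X2", "X2"),
  stringsAsFactors = FALSE
)

# Two-column index matrix: row number, and position of the column named in Z
idx <- cbind(seq_len(nrow(DF)), match(DF$Z, names(DF)))

# Indexing a data.frame with a two-column matrix picks one cell per row
DF$W <- DF[idx]

# Cross-check against an explicit two-column ifelse()
stopifnot(identical(DF$W, ifelse(DF$Z == "X1", DF$X1, DF$X2)))
```

Matrix indexing converts the data.frame to a matrix internally, so with columns of mixed types the selected values would be coerced to a common type.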