6

I want to be able to update a column conditionally, where one condition is based on a table join on multiple keys. Is there another way than using ifelse to do this?

Example, where I want to update flag based on a join, but first exclude where status is not NA:

library(data.table)
dt_A <- data.table(key1 = LETTERS[1:6], key2 = (1:6), flag=NA_character_ )
dt_A[4, flag:="exclude"]
dt_B <- data.table(key1 = LETTERS[3:6], key2 = c(3:5,0L)) 

print(dt_A)
   key1 key2    flag
1:    A    1      NA
2:    B    2      NA
3:    C    3      NA
4:    D    4 exclude
5:    E    5      NA
6:    F    6      NA

print(dt_B)
   key1 key2
1:    C    3
2:    D    4
3:    E    5
4:    F    0

Using ifelse achieves the right result:

dt_A[dt_B, flag:=ifelse(is.na(flag), "include", flag), on = c("key1","key2"), nomatch=0L]

print(dt_A)
   key1 key2    flag
1:    A    1      NA
2:    B    2      NA
3:    C    3 include
4:    D    4 exclude
5:    E    5 include
6:    F    6      NA

It seems to me not the most efficient way of doing it.

Is there a way of excluding rows with status != NA first before assigning the values?

Frank
  • 66,179
  • 8
  • 96
  • 180
Shayne H
  • 83
  • 5
  • 2
    You could do the following `dt_B[, flag := "include"] ; dt_A[is.na(flag), flag := dt_B[copy(.SD), flag, on = c("key1", "key2")]]`. I think this is a dupe of http://stackoverflow.com/questions/29658627/conditional-binary-join-and-update-by-reference-using-the-data-table-package or of http://stackoverflow.com/questions/28889057/update-a-column-of-nas-in-one-data-table-with-the-value-from-a-column-in-another. Though I'm still not sure why I had to `copy` in this case while it works without a `copy` in the potential dupe. – David Arenburg Aug 12 '16 at 08:25
  • Interestingly, it worked for me without `copy`, using data.table 1.9 .6: `dt_A[is.na(flag), flag:= dt_B[.SD, flag, on = c("key1", "key2")]]` – Shayne H Aug 13 '16 at 12:33

0 Answers0