Looping assignment by reference through columns

Question

I found many similar questions but noone of them fits to my problem. I have two large dataframes with few common columns. I'm trying to assign by reference the values from the first df to the second.

I tried more combinations but noone works coorrectly, for example:

library(data.table)

#create dfs
set.seed(32)
DB <- data.frame(A=sample(c("A","B","C","D","E"),30,replace = T))
DB2 <- data.frame(A=sample(c("A","B","C","D","E","F","G"),60,replace = T),
                              B=rep(rnorm(60,mean=5)),
                              C=rep(rnorm(60,mean=10)))
#loop 
for (i in c("B","C")){
setDT(DB)[DB2,  i := i, on = .(A == A)]}

Thus, I'd like to loop the following code:

setDT(DB)[DB2, B := B, on = .(A == A)]
setDT(DB)[DB2, C := C, on = .(A == A)]
> DB
    A        B         C
 1: C 5.593191 10.697466
 2: C 5.593191 10.697466
 3: E 4.482933  8.726371
 4: D 5.454512 11.054162
 5: A 4.306571 11.427917
 6: E 4.482933  8.726371
 7: D 5.454512 11.054162
 8: E 4.482933  8.726371
 9: D 5.454512 11.054162
10: B 4.741633 10.846106
11: D 5.454512 11.054162
12: B 4.741633 10.846106
13: D 5.454512 11.054162
14: D 5.454512 11.054162
15: B 4.741633 10.846106
16: D 5.454512 11.054162
17: D 5.454512 11.054162
18: C 5.593191 10.697466
19: D 5.454512 11.054162
20: E 4.482933  8.726371
21: D 5.454512 11.054162
22: E 4.482933  8.726371
23: C 5.593191 10.697466
24: A 4.306571 11.427917
25: C 5.593191 10.697466
26: E 4.482933  8.726371
27: C 5.593191 10.697466
28: C 5.593191 10.697466
29: C 5.593191 10.697466
30: D 5.454512 11.054162
    A        B         C

any help would be very grateful

Joins are costly, no reason to do the same one twice... `[DB2, (cols) := mget(paste0("i.", cols)), on = .(A = A)]` should work. — Frank, Oct 16 '18 at 15:37

nghauran · Accepted Answer · 2018-10-16T16:42:53.470

Try out:

library(data.table)
#create dfs
set.seed(32)
DB <- data.frame(A=sample(c("A","B","C","D","E"),30,replace = T))
DB2 <- data.frame(A=sample(c("A","B","C","D","E","F","G"),60,replace = T),
                  B=rep(rnorm(60,mean=5)),
                  C=rep(rnorm(60,mean=10)))
#try
setDT(DB)[DB2, c("B", "C") := list(B, C), on = .(A == A)]
DB #output
    A        B         C
 1: C 5.593191 10.697466
 2: C 5.593191 10.697466
 3: E 4.482933  8.726371
 4: D 5.454512 11.054162
 5: A 4.306571 11.427917
 6: E 4.482933  8.726371
 7: D 5.454512 11.054162
 8: E 4.482933  8.726371
 9: D 5.454512 11.054162
10: B 4.741633 10.846106
11: D 5.454512 11.054162
12: B 4.741633 10.846106
13: D 5.454512 11.054162
14: D 5.454512 11.054162
15: B 4.741633 10.846106
16: D 5.454512 11.054162
17: D 5.454512 11.054162
18: C 5.593191 10.697466
19: D 5.454512 11.054162
20: E 4.482933  8.726371
21: D 5.454512 11.054162
22: E 4.482933  8.726371
23: C 5.593191 10.697466
24: A 4.306571 11.427917
25: C 5.593191 10.697466
26: E 4.482933  8.726371
27: C 5.593191 10.697466
28: C 5.593191 10.697466
29: C 5.593191 10.697466
30: D 5.454512 11.054162
    A        B         C

Update

Franck's suggestion should also work fine and is more efficient for a large number or columns (note that mget returns a named list)

cols <- colnames(DB2)[!(colnames(DB2) %in% colnames(DB))]
setDT(DB)[DB2, (cols) := mget(paste0("i.", cols)), on = .(A = A)]

thank you, however using it I should write 160 column names inside the list. Is there a more practical way to do it? — Matt_4, Oct 16 '18 at 16:11
In that case Franck's suggestion would be more relevant. I edited my answer — nghauran, Oct 16 '18 at 16:43

Looping assignment by reference through columns

1 Answers1