I have been using R more and more, coming from C/C++. For this reason, I often find myself thinking à la C++ when working with R's data structures.
Here I have two data.tables. I need to go through table A and, wherever a value in its column 1 or column 2 matches column 1 of table B, replace it with the corresponding value from column 2 of table B.
Sorry if that description is confusing; let me try to make it clearer.
I have two data.tables (the row counts differ here, and in general they need not match):
TabA:
     Col1     Col2
 1:  TP53     CD68
 2:  TP53    MPDU1
 3:  TP53     PHF2
 4:  TP53 KIAA0753
 5:  CD68    ZBTB4
 6:  CD68     CHD3
 7: MPDU1    ZBTB4
 8: MPDU1     CHD3
 9: MPDU1   SLC2A4
10: MPDU1     YBX2
11: MPDU1    AURKB
12: MPDU1 TMEM132B
13:  PHF2 C9orf129
14:  PHF2    CDH23
15:  PHF2   PTPDC1
and TabB:
       Col3 Col4
 1:  ADAM32    0
 2:  ADARB2    1
 3:   AGBL2    2
 4:  ALOX12    3
 5: ANKRD46    4
 6:   APOL1    5
 7:   APOOL    6
 8:    ASPA    7
 9:     AUH    8
10:   AURKB    9
11:   AUTS2   10
12:    BAAT   11
So basically, I want to compare Col1 and Col2 from TabA with Col3 in TabB: wherever a value matches, replace the string with the corresponding number in Col4 of TabB.
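To make the example reproducible, the two tables above can be rebuilt roughly as follows. This is a sketch under assumptions the printed output does not confirm: that the gene names are character (not factor) and that Col4 is integer. The variable `shared` is only there for illustration.

```r
library(data.table)

# Rebuild the two tables shown above.
# Assumption: gene names are character and Col4 is integer.
TabA <- data.table(
  Col1 = c(rep("TP53", 4), rep("CD68", 2), rep("MPDU1", 6), rep("PHF2", 3)),
  Col2 = c("CD68", "MPDU1", "PHF2", "KIAA0753", "ZBTB4", "CHD3",
           "ZBTB4", "CHD3", "SLC2A4", "YBX2", "AURKB", "TMEM132B",
           "C9orf129", "CDH23", "PTPDC1")
)
TabB <- data.table(
  Col3 = c("ADAM32", "ADARB2", "AGBL2", "ALOX12", "ANKRD46", "APOL1",
           "APOOL", "ASPA", "AUH", "AURKB", "AUTS2", "BAAT"),
  Col4 = 0:11
)

# Names in TabA (either column) that also occur in TabB$Col3
shared <- intersect(c(TabA$Col1, TabA$Col2), TabB$Col3)
```

With this sample the only shared name is AURKB, so the desired result is that Col2 in row 11 becomes 9 and every other cell stays as it is.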
My approach, definitely C-style:
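for (i in seq_len(nrow(TabA))) {
    for (j in seq_len(nrow(TabB))) {
        # Replace a matching name in Col1 with its number from TabB
        if (TabA$Col1[i] == TabB$Col3[j]) {
            TabA$Col1[i] <- TabB$Col4[j]
        }
        # Same check for Col2
        if (TabA$Col2[i] == TabB$Col3[j]) {
            TabA$Col2[i] <- TabB$Col4[j]
        }
    }
}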
This works as expected, but I am pretty sure there is a more elegant (and more efficient) way to do that, exploiting data.table's capabilities. Does anybody have a suggestion?
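For comparison, a vectorized version might look something like the sketch below, which uses base R's `match()` together with data.table's `:=` and `.SDcols`. It runs on abbreviated copies of the tables, again assuming character gene names and an integer Col4; `replace_with_code` and `cols` are hypothetical names made up for illustration. I am not sure this is the idiomatic data.table way (an update join might be closer), so treat it as a starting point rather than a definitive answer.

```r
library(data.table)

# Abbreviated versions of the tables above (assumed column types:
# character gene names, integer Col4).
TabA <- data.table(Col1 = c("TP53", "MPDU1", "PHF2"),
                   Col2 = c("CD68", "AURKB", "CDH23"))
TabB <- data.table(Col3 = c("AUH", "AURKB", "AUTS2"),
                   Col4 = 8:10)

# Hypothetical helper: replace each name found in TabB$Col3 by its Col4
# value (as a string) and leave every other name untouched.
replace_with_code <- function(x) {
  idx <- match(x, TabB$Col3)   # NA where the name is not in TabB
  hit <- !is.na(idx)
  x[hit] <- as.character(TabB$Col4[idx[hit]])
  x
}

cols <- c("Col1", "Col2")
# Update both columns by reference, without an explicit loop over rows
TabA[, (cols) := lapply(.SD, replace_with_code), .SDcols = cols]
```

Because the columns hold strings, the substituted numbers end up as strings too, which is also what the nested loop produces when it assigns a number into a character column.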
Thanks