I have three data frames. One 'main' data frame to which I want to merge the other two sub data frames.
main = data.frame(a = LETTERS[1:5], b = round(rnorm(5),2))
a b
A 0.41
B -1.06
C -0.65
D -1.27
E 1.03
sub1 = data.frame(a = LETTERS[2:4], c = sample(c(100, 200, 300), 3, replace=T))
a c
B 200
C 200
D 200
sub2 = data.frame(a = LETTERS[c(1,5)], c = sample(c(999, 888), 2, replace=T))
a c
A 999
E 888
In the first merge, I want to merge sub1
to main
via "a"
and this works.
merged1 = merge(main, sub1, by="a", all.x=T)
a b c
A 0.41 NA
B -1.06 300
C -0.65 300
D -1.27 100
E 1.03 NA
Now I want to join/merge the sub2
to merged1
. In my case, the values provided by sub2
substitute the NAs in column c. Specifically I want to replace these NAs with the values from sub2
. I tried this (and other arguments for all.x
, all.y
:
merge(merged1, sub2, by="a", all.x=T)
a b c.x c.y
A 0.41 NA 999
B -1.06 300 NA
C -0.65 300 NA
D -1.27 100 NA
E 1.03 NA 999
How can I get this to be only one column c with the merge?