I know there is a lot of questions on merge, which I screened. However maybe my english is just lacking the right words to phrase my to formulate my query correctly.
I have one data.frame with all the data of interested, except, its missing information for the last few rows on one variable. In the MRE below for the cases 01-04 & 01-05
Age is missing.
MRE
pacman::p_load(dplyr)
n_obs <- 5
left <- data.frame(Code = c(paste0(rep("01_0"),c(1:n_obs))),
Age = round(c(runif(n_obs-2,18,99),NA,NA)),
x1 = c(runif(n_obs,1,100)),
x2 = c(runif(n_obs,1,100))
)
left
Code Age x1 x2
1 01_01 45 66.56694 69.38042
2 01_02 85 22.98283 57.22248
3 01_03 91 35.59674 11.84955
4 01_04 NA 15.44669 34.97902
5 01_05 NA 41.52226 65.40847
The missing information in the left$Age
column is however available in a second data.frame, called right.
right <- data.frame(Code = c(paste0(rep("01_0"),c((n_obs-1):n_obs))),
Age = round(c(runif(2,18,99)))
)
right
Code Age
1 01_04 48
2 01_05 95
My goal is to fill the missings in left$Age
for 01-04 & 01-05
with the information from the second data.frame. My Approach was to use left_join()
, from dplyr. However this appends a new column Age.y
without filling the existing one.
left_join(left,right, by="Code")
Code Age.x x1 x2 Age.y
1 01_01 45 66.56694 69.38042 NA
2 01_02 85 22.98283 57.22248 NA
3 01_03 91 35.59674 11.84955 NA
4 01_04 NA 15.44669 34.97902 48
5 01_05 NA 41.52226 65.40847 95
The Question would be: Is there a way to directly fill the existing column of left
with the information from right
? Is there a better form of join / merge that accomplished that?
I am sorry if there is a well known function, as I think I lack the correct words to find what I am looking for.
Edit; Desired Output:
Code Age x1 x2
1 01_01 94 55.65129 59.62612
2 01_02 87 45.13744 37.49517
3 01_03 89 75.40357 93.33414
4 01_04 48 23.91269 65.42941
5 01_05 95 25.75679 58.18170