2

I know there is a lot of questions on merge, which I screened. However maybe my english is just lacking the right words to phrase my to formulate my query correctly.

I have one data.frame with all the data of interested, except, its missing information for the last few rows on one variable. In the MRE below for the cases 01-04 & 01-05 Age is missing.

MRE

pacman::p_load(dplyr)
n_obs <- 5
left <- data.frame(Code = c(paste0(rep("01_0"),c(1:n_obs))),
                   Age = round(c(runif(n_obs-2,18,99),NA,NA)), 
                   x1 = c(runif(n_obs,1,100)), 
                   x2 = c(runif(n_obs,1,100))
                   )

left
   Code Age       x1       x2
1 01_01  45 66.56694 69.38042
2 01_02  85 22.98283 57.22248
3 01_03  91 35.59674 11.84955
4 01_04  NA 15.44669 34.97902
5 01_05  NA 41.52226 65.40847

The missing information in the left$Agecolumn is however available in a second data.frame, called right.

right <- data.frame(Code = c(paste0(rep("01_0"),c((n_obs-1):n_obs))),
                    Age = round(c(runif(2,18,99)))
                    )
right
   Code Age
1 01_04  48
2 01_05  95

My goal is to fill the missings in left$Age for 01-04 & 01-05 with the information from the second data.frame. My Approach was to use left_join(), from dplyr. However this appends a new column Age.y without filling the existing one.

left_join(left,right, by="Code")
   Code Age.x       x1       x2 Age.y
1 01_01    45 66.56694 69.38042    NA
2 01_02    85 22.98283 57.22248    NA
3 01_03    91 35.59674 11.84955    NA
4 01_04    NA 15.44669 34.97902    48
5 01_05    NA 41.52226 65.40847    95

The Question would be: Is there a way to directly fill the existing column of left with the information from right? Is there a better form of join / merge that accomplished that?

I am sorry if there is a well known function, as I think I lack the correct words to find what I am looking for.

Edit; Desired Output:

   Code Age       x1       x2
1 01_01  94 55.65129 59.62612
2 01_02  87 45.13744 37.49517
3 01_03  89 75.40357 93.33414
4 01_04  48 23.91269 65.42941
5 01_05  95 25.75679 58.18170
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
SysRIP
  • 159
  • 8
  • 2
    This operation is commonly referred to as 'update join'. Try a search "update join data.table R" and you will find several examples. Good luck! – Henrik Jul 08 '21 at 14:18
  • Thank you so much, that was what I was looking for. Btw to the moderator I would not mark duplicated to questions where no answer is accepted – SysRIP Jul 08 '21 at 14:19
  • 1
    Well, ["Accepting an answer is not mandatory"](https://stackoverflow.com/help/someone-answers), and it shouldn't be used to judge the "value" of an answer. – Henrik Jul 08 '21 at 14:24
  • 1
    And thanks a lot for a clear question with a nice small MRE and the code you tried! – Henrik Jul 08 '21 at 14:33
  • If you are interested in a tidyverse solution you can find one on this [SO question](https://stackoverflow.com/a/51577964/7318488) – Björn Sep 09 '21 at 09:10

0 Answers0