1

I would like to swiftly replace NA values in the vector id in first df based on x0 column value in second df without long steps of merging and deleting columns. In other words x0 is the common record identifier. Simple example is following:

first <- data.frame(x0=1:5,
                     x3=90:94,
                     id=c("a1","a2","a3",NA,NA))`
  x0 x3   id
1  1 90   a1
2  2 91   a2
3  3 92   a3
4  4 93 <NA>
5  5 94 <NA>`
second <- data.frame(x0=4:8,
                     x2=seq(104,108,1),
                     id=c("a4","a5","a6","a7","a8"))`
  x0  x2 id
1  4 104 a4
2  5 105 a5
3  6 106 a6
4  7 107 a7
5  8 108 a8`

The expected result would be:

  x0 x3 id
1  1 90 a1
2  2 91 a2
3  3 92 a3
4  4 93 a4
5  5 94 a5`
Matt
  • 2,947
  • 1
  • 9
  • 21
MIH
  • 1,083
  • 3
  • 14
  • 26

1 Answers1

1

You are looking for a update-join or a gap-fill-join which could be done like:

first$id  <- as.character(first$id)
second$id  <- as.character(second$id)

idx <- match(first$x0, second$x0) #Update join
idxn <- which(!is.na(idx))
first$id[idxn]  <- second$id[idx[idxn]]
first
#  x0 x3 id
#1  1 90 a1
#2  2 91 a2
#3  3 92 a3
#4  4 93 a4
#5  5 94 a5
GKi
  • 37,245
  • 2
  • 26
  • 48