0

I would like to join similar data frames:

input:
x <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,NA,NA,NA))
y <- data_frame(a=c(2,3),b=c(5,6),c=c(1,2))

desired output:
z <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,1,2,NA))

I tried

x <- data_frame(a=c(1,2,3,4),b=c(4,5,6,7),c=c(1,NA,NA,NA))
y <- data_frame(a=c(2,3),b=c(5,6),c=c(1,2))

z <- merge(x,y, all=TRUE)

but it has one inconvenience:

  a b  c
1 1 4  1
2 2 5  1
3 2 5 NA
4 3 6  2
5 3 6 NA
6 4 7 NA

It doubles rows where there are similarities. Is there a way to get desired output without deleting unwanted rows?

EDIT

I can not delete rows with NA, x data frame consists of rows with NA which are not in y data frame. If I would do this I would deleted 4th row from x data frame (4 7 NA)

Thanks for help

piotr
  • 152
  • 1
  • 2
  • 13
  • @Jaap your example gives solution by deleting NA values. I can not do this in my example - rows with NA (no duplicated have to stay) – piotr Jan 20 '18 at 11:21
  • I've undeleted my answer. As you can see, it is the exact same approach as in the linked questions imo. – Jaap Jan 20 '18 at 11:25
  • thanks you are right your answer is exactly what I was looking for :) – piotr Jan 20 '18 at 11:27

1 Answers1

0

You can use an update join with the data.table package:

# load the packge and convert the dataframes to data.table's
library(data.table)
setDT(x)
setDT(y)

# update join
x[y, on = .(a, b), c := i.c][]

which gives:

   a b  c
1: 1 4  1
2: 2 5  1
3: 3 6  2
4: 4 7 NA
Jaap
  • 81,064
  • 34
  • 182
  • 193