0

I have two dataframes that need to be merged on a single matching column and the second table needs to replace all values in the first table where the columns match.

I've tried several variations of the below code but nothing is returning a dataframe the same size as the original.

merge.data.frame(x, y, by.x = "Name", by.y = "Name")
merge.data.frame(x, y, by.x = "SN", all.x = FALSE,y all.y = TRUE)

The variables being used for this are shown below

x <- data.frame("SN" = 1:4, "Age" = c(21,15,44,55), "Name" = c("John","Dora", NA, NA))
y <- data.frame("SN" = 4, "Age" = c(100), "Name" = c("B"))
  SN Age Name       # x dataframe
1  1  21 John 
2  2  15 Dora 
3  3  44  NA 
4  4  55  NA

  SN Age Name       # y dataframe
5  4 100  B

The final result of the dataframe should be what is below:

Joined on "SN" and overwriting the values in columns "Age" and "Name" replacing what is in x with what is in y.

  SN Age Name       # result
1  1  21 John 
2  2  15 Dora 
3  3  44  NA 
4  4 100  B

Edit: If you have additional columns as shown in the dataframe below that are not in y

  SN Gender Age Name       # z table with additional column
1  1   M    21  John 
2  2   F    15  Dora 
3  3   M    44   NA 
4  4   M    55   NA

Running the code below from plyr package will drop unnecessary columns

library(plyr)
rbind.fill(z[!z$SN %in% y$SN,], y[,])
  SN Gender Age Name       # result if additional columns are present in z and not in y
1  1   M    21  John 
2  2   F    15  Dora 
3  3   M    44   NA 
4  4   M    100  B
William Humphries
  • 541
  • 1
  • 10
  • 21
  • 1
    Duplicate? https://stackoverflow.com/questions/6112260/conditional-merge-replacement-in-r – Cole Jun 16 '19 at 00:37
  • @Cole Yes, the questions posed are similar but the solutions for the question referenced replace only values for a single column, this will match for the column and bind the entire row to the dataframe – William Humphries Jun 16 '19 at 15:48

2 Answers2

4

You can remove SN values in x that match SN values in y then row bind the the two dataframes.

rbind(x[!x$SN %in% y$SN,], y) 

  SN Age Name
1  1  21 John
2  2  15 Dora
3  3  44 <NA>
4  4 100    B
Ritchie Sacramento
  • 29,890
  • 4
  • 48
  • 56
2

A longer solution that I find easier to read is to first filter rows in x that are presend in y, then add y to x.

# Added stringsAsFactors as False in df's.
x <- data.frame("SN" = 1:4, "Age" = c(21,15,44,55), "Name" = c("John","Dora", NA, NA), stringsAsFactors = F)
y <- data.frame("SN" = 4, "Age" = c(100), "Name" = c("B"), stringsAsFactors = F)

# Joins
x %>% 
  anti_join(y, by = c("SN")) %>%  # Remove row from x that are in y
  bind_rows(y)                    # Add rows from y to x
davsjob
  • 1,882
  • 15
  • 10