0

I have read over the responses to this question:

Using one data.frame to update another

And the answers seem to work, but only for integers. I am trying to update a certain data frame, called "original":

original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") )

With a replacement data.frame that will overwrite some values:

replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" , Value2 = "Orange")

And should end up looking like this:

goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") )

Like the previous question, the solution should work for tables of arbitrary length and/or size. Any thoughts?

Community
  • 1
  • 1
knaslund
  • 33
  • 4
  • First convert the `factor` class to `character` class and then do `original[match(replacement$Name, original$Name), 3:4] <- replacement[, 3:4]` – akrun Mar 01 '16 at 17:53

1 Answers1

0

Merge with all.x=TRUE and replace the values for rows where either of the value columns are non-NA:

goal <- merge(original, replacement, by=c("Name", "Id") ,all.x=TRUE)
goal
#----
    Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A  1   Yellow     <NA>     <NA>     <NA>
2 Drug B  2     <NA>     Blue      Red   Orange

goal [ !is.na(goal$Value1.y)&!is.na(goal$Value2.y), c("Value1.x", "Value2.x")] <-
 goal [ !is.na(goal$Value1.y)&!is.na(goal$Value2.y), c("Value1.y", "Value2.y")]
 goal
 #------
Name Id Value1.x Value2.x Value1.y Value2.y
1 Drug A  1   Yellow     <NA>     <NA>     <NA>
2 Drug B  2      Red   Orange      Red   Orange

 goal[-(5:6)]
 #------
    Name Id Value1.x Value2.x
1 Drug A  1   Yellow     <NA>
2 Drug B  2      Red   Orange

As already mentioned this will need to be done with character vectors rather than factors. My data setup avoided this newbie pitfall (after I fell into it for the umpteenth time) with stringsAsFactors =TRUE:

 (original = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , 
                         Value1 = c("Yellow",NA), Value2 = c(NA,"Blue") , 
                         stringsAsFactors=FALSE))

(replacement = data.frame( Name = c("Drug B") , Id = 2 , Value1 = "Red" , 
                           Value2 = "Orange" , stringsAsFactors=FALSE))

(goal = data.frame( Name = c("Drug A","Drug B") , Id = c( 1 , 2) , 
                    Value1 = c("Yellow","Red"), Value2 = c(NA,"Orange") , 
                    stringsAsFactors=FALSE ))

This could have been done globally with (BEFORE the data.frame or other data-input calls). Some R shops run with this option all the time and if you are doing data management with R then I think it is wise to consider it as your modus operandi:

  options(stringsAsFactors=FALSE)
IRTFM
  • 258,963
  • 21
  • 364
  • 487