0

I have a dataframe A:

ID A B C
2000 2 3 4
2001 2 7 2
2002 1 8 5
2003 6 2 3

And another dataframe B:

ID  B     
2001 0 
2003 0 

Expected output:

ID A B C
2000 2 3 4
2001 2 0 2
2002 1 8 5
2003 6 0 3

This is a large dataset and I want find all common IDs and replace a particular column with values from another dataframe.

How can I do this?

maximusdooku
  • 5,242
  • 10
  • 54
  • 94
  • 1
    Possible duplicate of [How to join (merge) data frames (inner, outer, left, right)?](http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – nrussell Jan 10 '17 at 22:02
  • @nrussell How can I merge them since I only want to "insert" one of the columns? The columns are not the same in A and B. – maximusdooku Jan 10 '17 at 22:07
  • 1
    `dfA[match(dfB$ID, dfA$ID),names(dfB)[-1]] <- dfB[-1]` will work for any number of columns to replace. – Pierre L Jan 10 '17 at 22:07
  • Left join on ID and use the B values from dfB where the join constraint is satisfied, else use the B values from dfA. – nrussell Jan 10 '17 at 22:09

1 Answers1

0

Try this:

library(data.table)
dt1 = data.table(ID = c(2000,2001,2002,2003), A = c(2,2,1,6), B = c(3,7,8,2), C = c(4,2,5,3))
dt2 = data.table(ID = c(2001, 2003), B = c(0,0))
dt1 = merge(dt1,dt2,by = c("ID"),all.x = TRUE)

which gives us

     ID A B.x C B.y
1: 2000 2   3 4  NA
2: 2001 2   7 2   0
3: 2002 1   8 5  NA
4: 2003 6   2 3   0

Then, I'm not sure if you had a typo and really wanted to reset B, but if you did want to reset C to the B.y run this

dt1$C = ifelse(!is.na(dt1$B.y),dt1$B,dt1$C)

Which will give us:

      ID A B.x C B.y
1: 2000 2   3 4  NA
2: 2001 2   7 0   0
3: 2002 1   8 5  NA
4: 2003 6   2 0   0

Then, just drop B.y and change B.x back to B

dt1[,B.y:= NULL]
colnames(dt1)[3] = "B"

     ID A B C
1: 2000 2 3 4
2: 2001 2 7 0
3: 2002 1 8 5
4: 2003 6 2 0
Kristofersen
  • 2,736
  • 1
  • 15
  • 31