4

Suppose I have data frame df1:

df1
    R1
1   A
2   B
3   C
4   D
5   E
6   F
7   G
8   H

And another data frame df2:

df2
    R1  R2
1   A   2
2   B   5
3   D   7
4   E   9
5   F   12
6   J   16

How would I make a new column in df1, named R2, that would ascribe the right value from df2$R2 to df1$R2 according to the matching entry from df2$R1? Any way I try to do this I end up getting errors related to the difference in lengths. However, is there a way I can force the values from df2$R2 into a new column in df1 and just have NAs (or NaNs or whatever) in any rows in df1 where no corresponding value exists in df2$R1? And also have it ignore entries in df2 for which there is no corresponding row in df1 (for example, row 6, where R1=J, because there is no J in df1$R1). For my example, the data set I want would look like this:

    R1  R2
1   A   2
2   B   5
3   C   NA
4   D   7
5   E   9
6   F   12
7   G   NA
8   H   NA

So basically, df1$R2 should equal what df2$R2 equals if df2$R1 is the same as df1$R1. Sorry if this has been asked before, I couldn't find it if it has. Thanks.

user3593717
  • 131
  • 6
  • You didn't share any of the code you tried. Did you try a `merge()`? That's basically what you are doing. That should have come up before you posted the question. – MrFlick Jun 07 '14 at 23:50
  • `merge` does not accomplish what was requested. I agreed that it seemed reasonable that is was a duplicate and it may still be. I searched with `[r] data.frame columns match` and with `[r] lookup data.frame` and reviewed many of the search results. (And I could have closed it with one vote but therefore didn't.) Unless someone can really find a true duplicate then I think the user community should do the courteous thing and reopen. – IRTFM Jun 08 '14 at 15:14

2 Answers2

2

This is a task for the match function. It's primary use is to generate the appropriate index for use with the "[" function, i.e. for selection of other elements or rows. It is a key function in the merge operation, but a full merge was not really what you were asking for:

 df1 <- read.table(text="R1
 1   A
 2   B
 3   C
 4   D
 5   E
 6   F
 7   G
 8   H", header=TRUE)
 df2<- read.table(text=" R1  R2
 1   A   2
 2   B   5
 3   D   7
 4   E   9
 5   F   12
 6   J   16", header=TRUE)
 df1$R2 <- df2$R2[ match(df1$R1, df2$R1) ]
 df1
#-----------
  R1 R2
1  A  2
2  B  5
3  C NA
4  D  7
5  E  9
6  F 12
7  G NA
8  H NA
IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • This is exactly what I need, thank you very much. I knew there must be a fairly simple solution, couldn't think of it myself. – user3593717 Jun 08 '14 at 06:17
2

merge does this using the all.x (or all.y) argument, to indicate to use all rows of one of the inputs:

merge(df1, df2, all.x=TRUE)
##   R1 R2
## 1  A  2
## 2  B  5
## 3  C NA
## 4  D  7
## 5  E  9
## 6  F 12
## 7  G NA
## 8  H NA
Matthew Lundberg
  • 42,009
  • 6
  • 90
  • 112