Matching a column from one data frame to another by unique values of other columns

Question

I have two data frames:

>old 
response    RT  sNumber blockNo
tiger       170       1       1
tornado     36        1       2
tiger       43        1       3
squire      34        2       1
tiger       48        2       2
tornado     49        2       3
tornado     45        3       1
mouse       66        3       2
tiger       75        3       3

>new
response    sNumber blockNo
tiger             1       1
tornado           1       2
squire            2       1
tiger             2       2
tornado           3       1
mouse             3       2
tiger             3       3

new has fewer rows than old. I want to copy the RT column from old into new, matching on response while keeping the RT value that belongs to each unique combination of sNumber and blockNo. It should look like this:

>new2
response    RT  sNumber blockNo
tiger      170        1       1
tornado     36        1       2
squire      34        2       1
tiger       48        2       2
tornado     45        3       1
mouse       66        3       2
tiger       75        3       3

Usually for mapping I use this loop:

for(wrd in unique(old$response)){
    new$RT[new$response == wrd] <- old$RT[old$response == wrd]
    }

However, in this particular case it messes up all the RT values, since it assigns them one after another without checking for the unique blockNo and sNumber. How should I perform the mapping of RT in the way I have described?

Check out the `mapvalues` function in the `plyr` package. – mrp Jun 15 '16 at 23:00

nya · Answer 1 · 2016-06-15T23:40:06.713

To match values in multiple columns in two data.frames and add extra data from one to the other, you can use merge.

 merge(old, new, by = c("response", "sNumber", "blockNo"), all = FALSE)
  response sNumber blockNo  RT
1    mouse       3       2  66
2   squire       2       1  34
3    tiger       1       1 170
4    tiger       2       2  48
5    tiger       3       3  75
6  tornado       1       2  36
7  tornado       3       1  45

This checks if values in all named columns specified in the by= argument match between the data.frames. When they all match, additional columns in either data.frame are added to the merged data.
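
As a reproducible sketch of the above, the data frames can be rebuilt from the values printed in the question. The object names `keys`, `merged`, `key_old`, `key_new` and `new2` are chosen here for illustration, and the `match()` variant is an alternative added for the case where the row order of new should be kept, since `merge` returns rows sorted by the key columns.

```r
# Rebuild the two data frames from the question (values copied from the post).
old <- data.frame(
  response = c("tiger", "tornado", "tiger", "squire", "tiger",
               "tornado", "tornado", "mouse", "tiger"),
  RT       = c(170, 36, 43, 34, 48, 49, 45, 66, 75),
  sNumber  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  blockNo  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  stringsAsFactors = FALSE
)
new <- data.frame(
  response = c("tiger", "tornado", "squire", "tiger",
               "tornado", "mouse", "tiger"),
  sNumber  = c(1, 1, 2, 2, 3, 3, 3),
  blockNo  = c(1, 2, 1, 2, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Columns that must all agree for two rows to count as the same trial.
keys <- c("response", "sNumber", "blockNo")

# Join on all three key columns; only rows present in both are kept,
# and the result is sorted by the key columns.
merged <- merge(old, new, by = keys)
print(merged)

# Alternative (an assumption about what is wanted): keep the row order of
# `new` by building one composite key per row and looking it up in `old`.
key_old <- paste(old$response, old$sNumber, old$blockNo, sep = "|")
key_new <- paste(new$response, new$sNumber, new$blockNo, sep = "|")
new2 <- new
new2$RT <- old$RT[match(key_new, key_old)]
print(new2)
```

Both results carry the same RT values. They differ only in row order and column order, so the `match()` form is mainly useful when new must stay in its original order.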

The all argument controls which rows end up in the result. With all = FALSE (the default), the merged data contains only the rows that matched in the selected columns between the two data.frames, which is what we want here. With all = TRUE, the merged data.frame contains all rows present in either data.frame, with NA in the columns that have no match.
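
To see the effect of this argument on the question's data, the following sketch compares the row counts. The data frames are rebuilt from the question, the names `inner`, `outer` and `keep_new` are illustrative, and `all.y = TRUE` is shown as an extra option of `merge` that keeps every row of the second data frame.

```r
# Data copied from the question.
old <- data.frame(
  response = c("tiger", "tornado", "tiger", "squire", "tiger",
               "tornado", "tornado", "mouse", "tiger"),
  RT       = c(170, 36, 43, 34, 48, 49, 45, 66, 75),
  sNumber  = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  blockNo  = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  stringsAsFactors = FALSE
)
new <- data.frame(
  response = c("tiger", "tornado", "squire", "tiger",
               "tornado", "mouse", "tiger"),
  sNumber  = c(1, 1, 2, 2, 3, 3, 3),
  blockNo  = c(1, 2, 1, 2, 1, 2, 3),
  stringsAsFactors = FALSE
)
keys <- c("response", "sNumber", "blockNo")

# Only rows whose keys appear in both data frames.
inner <- merge(old, new, by = keys, all = FALSE)

# Every row from either data frame; the two trials dropped from `new`
# come back because they still exist in `old`.
outer <- merge(old, new, by = keys, all = TRUE)

# Every row of `new` (the second argument), matched or not.
keep_new <- merge(old, new, by = keys, all.y = TRUE)

cat("all = FALSE:", nrow(inner), "rows\n")
cat("all = TRUE: ", nrow(outer), "rows\n")
cat("all.y = TRUE:", nrow(keep_new), "rows\n")
```

Here every row of new has a partner in old, so `all.y = TRUE` gives the same rows as the default. It would differ only if new contained a trial that is missing from old, in which case that row would get NA for RT.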

edited Jun 15 '16 at 23:40

answered Jun 15 '16 at 23:18

The all argument is set to FALSE automatically. – lmo Jun 15 '16 at 23:29

@lmo Yes, but I felt it needed to be specified to better address the question. – nya Jun 15 '16 at 23:31

Better to just explain this in the text, something like: "there is an argument all, that is set to FALSE by default. It happens that this is what we want in this situation... in other circumstances, it can be set to TRUE to retrieve all observations from both data.frames." – lmo Jun 15 '16 at 23:34
@lmo Added. Thanks for the suggestion. Constructive. – nya Jun 15 '16 at 23:41
@nya Thank you a lot! It worked – MariKo Jun 16 '16 at 11:26
