Extracting a column from one dataset and creating another dataset with columns from a third dataset in R

Question

So I have these two datasets:

    ID      DOB         ID2   count
1   4083    2007-10-01  3625    5
2   4408    2008-07-01  3603    2
3   4514    2007-07-01  3077    3
4   4396    2008-05-01  3413    5
5   4222    2003-12-01  3341    1
6   4291    2000-07-01  3201    5
7   4581    2005-07-01  3836    1
8   4487    2007-01-01  3264    5
9   4916    2009-10-01  3825    1
10  4277    2000-04-01  3381    2

ID       DOB       score1   score2  score3  score4  score5  score6
4291    2000-07-01  2       5       2       2       1       2
4323    2000-07-01  3       3       1       4       2       5
4408    2008-07-01  4       2       5       5       3       5
4222    2003-12-01  2       1       3       2       3       3
4581    2005-07-01  5       1       5       2       3       1
4005    2003-06-01  1       4       2       4       5       3
4718    2009-02-01  2       3       1       5       5       5
4396    2008-05-01  3       5       2       2       2       5
4924    2008-02-01  5       5       4       5       5       4
4083    2007-10-01  4       5       1       3       3       4
4099    2000-05-01  4       3       1       2       1       2
4277    2000-04-01  2       2       1       3       1       1
4487    2007-01-01  2       5       2       4       3       5
4514    2007-07-01  1       3       4       3       1       5
4003    2005-07-01  3       3       4       1       1       3
4366    2008-12-01  4       4       4       4       3       4
4790    2009-07-01  1       3       1       3       1       4
4643    2002-03-01  3       2       3       3       4       3
4475    2009-05-01  1       4       3       3       3       3
4916    2009-10-01  5       1       3       1       2       2

Within dataset2 there are the ID and Dobs from dataset1, along with other rows of IDs of subjects Im not interested in. What I would like to do is to extract the IDs present in both datasets and create a dataset with the "ID2" column from dataset 1 and the other columns from dataset 2. Like so:

ID       DOB         ID2    score1  score2  score3  score4  score5  score6
4394    2004-11-01  3625    2       2       4       2       2       3
4181    2002-04-01  3603    3       1       3       2       2       5
4942    2001-08-01  3077    3       3       5       3       1       5
4765    2003-05-01  3413    1       5       5       2       3       2
4517    2003-03-01  3341    1       2       1       4       1       5
4905    2002-12-01  3201    5       2       4       1       1       5
4636    2004-07-01  3836    3       1       1       4       4       4
4179    2004-08-01  3264    5       2       5       5       4       2
4448    2007-11-01  3825    2       3       5       4       2       4
4218    2006-04-01  3381    1       5       3       4       5       3

I think the merge function comes into play here but for the life of me I cant seem to get it to work so any help you can give me will be gratefully received.

In the expected output, there is `4394` ID, but in the initial datasets, it is not there. — akrun, Jun 05 '15 at 13:27
@googleplex, I don't see ID=4394 in your first data.frame (nor in your second one) so it's kind of hard to understand the desired output... — Cath, Jun 05 '15 at 13:27
Whoops! My apologies, not sure how that happened. going to fix it now — googleplex101, Jun 05 '15 at 13:29
Looking at the help pages of `merge` (or any other function) is a good way to understand how it works as there are examples in it. Try `?merge` — akrun, Jun 05 '15 at 13:33

score 2 · Accepted Answer · answered Jun 05 '15 at 13:25

2

does this answer your need ?

 merge(df1, df2, by.x=c("ID","DOB"))

answered Jun 05 '15 at 13:25

Cath

23,906
5
52
86

1

If so, this is a a major dupe. Where did I put my hammer.... – David Arenburg Jun 05 '15 at 13:27
@DavidArenburg, I indeed think you can pull it out ;-) – Cath Jun 05 '15 at 13:28
That appears to have done the trick. Thanks! Not sure what I was doing wrong with the merge function. – googleplex101 Jun 05 '15 at 13:30
My apologies for apparently duplicating a question. Shall I delete it? – googleplex101 Jun 05 '15 at 13:31
@googleplex I think it's just gonna be marked as duplicate – Cath Jun 05 '15 at 13:31
1

@googleplex101 Please don't delete the question. It will add to the duplicate and makes it more easy to track a problem. – akrun Jun 05 '15 at 13:38

Extracting a column from one dataset and creating another dataset with columns from a third dataset in R

1 Answers1