3

have the following data frames:

id1<-c(1,2,3,4,5)
spent<-c(10,20,30,40,50)
id2<-c(1,3,4)
x<-c(1,2,2)
df1<-data.frame(id1,spent)
df2<-data.frame(id2,x)

I need to find the ids in df1 which also exist in df2 and export all their information to a new data frame (let's say df3). on this basis df3 should look as follow:

   id1     spent
   1         10
   3         30
   4         40

I would be thankful if you could help me with this problem.

AliCivil
  • 2,003
  • 6
  • 28
  • 43
  • -1 duplicate: http://stackoverflow.com/questions/9297237/merging-two-data-frames-in-r-that-have-common-and-uncommon-samples – Kay Sep 06 '12 at 14:33

2 Answers2

8

Use merge see ?merge for get information about by.x and by.y arguments

merge(df1, df2, by.x="id1", by.y="id2")[,-3] # this is the desired output you showed
  id1 spent
1   1    10
2   3    30
3   4    40

merge(df1, df2, by.x="id1", by.y="id2") # this is with "all their information"
  id1 spent x
1   1    10 1
2   3    30 2
3   4    40 2
Jilber Urbina
  • 58,147
  • 10
  • 114
  • 138
6

You could use the data.table package, which might be faster than using merge if you're merging a lot of IDs. For example,

library(data.table)

dt1 <- data.table(id1, spent, key = "id1")

dt1[J(unique(df2$id2))]
#    id1 spent
# 1:   1    10
# 2:   3    30
# 3:   4    40

n.b. The unique also probably isn't necessary, but I included it in case the real data include duplicate id2s.

EDIT The J() is necessary, plus see the comment by Matthew Dowle.

BenBarnes
  • 19,114
  • 6
  • 56
  • 74
  • The `J()` *is* necessary, otherwise it would lookup by row number rather than the column contents. `J()` (or `.()` etc) is only optional on `character` input, but needed for `numeric`. Might be easier to point people to use `merge()`, but use `merge()` on `data.table` for the speed. That can maybe introduce `data.table` more easily than needing to learn `J()` etc. The `merge.data.table` method is a lot faster in v1.8.0+ than it used to be, as fast as `X[Y]` in many cases. – Matt Dowle Sep 06 '12 at 13:20