
I have a problem merging two data frames. Table A has 3094 rows and table B has 235 rows. I need a left outer join, that is, one that keeps all observations in A, so I expected the result to have 3094 rows. Instead I get this:

dim(A)[1]
#[1] 3094

dim(B)[1]
#[1] 235

AB <- merge(A, B, by = "id", all.x = TRUE)

dim(AB)[1]
#[1] 3104

I can't understand why I get more rows than A has. Is there an additional argument that I can use? I would be grateful for your help.

Uwe
fcochaux
  • My guess is that there are duplicate IDs in the second data.frame. – lmo Feb 10 '17 at 15:11
  • @lmo I know what the problem is: it has to do with duplicates introduced at the beginning of the whole process. Unfortunately, I can't provide my data, and it is a specific problem. – fcochaux Feb 10 '17 at 15:48
  • That's why the possible duplicate that @symbolrush mentioned doesn't work (I have seen it). Thanks for your help and guidance. – fcochaux Feb 10 '17 at 15:48
  • Try `sum(duplicated(df2$id))` or whatever. This will either confirm or reject my guess. If yes, then `which(duplicated(df2$id))` will provide the indices of the dupes and `dupeData <- df2[duplicated(df2$id) | duplicated(df2$id, fromLast=TRUE),]` will provide the data.frame with the duplicates. – lmo Feb 10 '17 at 15:55
  • Possible duplicate of [How to join (merge) data frames (inner, outer, left, right)?](http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – MLavoie Feb 10 '17 at 17:55
  • @lmo thanks, it worked well. I solved my problem. @MLavoie sorry, but I used `merge` correctly, and that question is about how to use this function. In this case the problem was duplicates in my data frames. – fcochaux Feb 10 '17 at 18:32
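
For readers who land here with the same symptom, the behaviour discussed in the comments can be reproduced with a minimal sketch. The data below are made up (the asker's data were never posted), and the columns `x` and `y` are hypothetical. Each row of A is repeated once for every matching row in B, so a duplicated `id` in B makes a left join return more rows than A has. The last step, keeping the first row per `id`, is only one possible fix and is an assumption; whether it is appropriate depends on why the duplicates exist.

```r
# Hypothetical toy data: ids in A are unique, id 2 appears twice in B.
A <- data.frame(id = 1:5, x = letters[1:5])
B <- data.frame(id = c(2, 2, 4), y = c("p", "q", "r"))

# Left outer join: every row of A is kept, but id 2 matches two rows of B.
AB <- merge(A, B, by = "id", all.x = TRUE)
nrow(A)   # 5
nrow(AB)  # 6, one more than A because of the duplicated id in B

# Diagnostics suggested in the comments: count and inspect duplicated keys.
n_dupes   <- sum(duplicated(B$id))
dupe_rows <- B[duplicated(B$id) | duplicated(B$id, fromLast = TRUE), ]

# One possible fix (an assumption, not the asker's method):
# keep only the first row per id in B before merging.
B_unique <- B[!duplicated(B$id), ]
AB_fixed <- merge(A, B_unique, by = "id", all.x = TRUE)
nrow(AB_fixed)  # 5, same as A
```

If the duplicated rows in B carry different information, collapsing them (for example with `aggregate`) before merging may be a better choice than dropping rows.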

0 Answers