Best merge or join function in R

Question

I have two dataframes df1 and df2. Both have a common identifier column.

df1 has one row per identifier, but it also contains identifier values that are not in df2.

df2 has multiple lines for each identifier value.

I want to merge the two so that I preserve the number of rows of df2, but map the (repeating) relevant ID rows from df1 into df2.

Is it best to use merge or join or something else? What arguments?

Thanks :)

Try `merge` with `all=TRUE`. Read `?merge` for your options and try them out if you're not sure which are best. — Frank, Apr 19 '16 at 20:41

score 0 · Answer 1 · answered Apr 19 '16 at 20:46

Without sample input it is hard to provide tested code, but the dplyr package's join functions (left_join, inner_join, full_join and so on) cover the common kinds of joins.

In this case, you can try something like:

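library(dplyr)
# df2 goes first so that its rows (and its row count) are preserved
newdf <- left_join(df2, df1)

This keeps every row of df2, including the repeated IDs, and fills in the df1 columns wherever the common column matches. IDs that appear only in df1 are dropped, and because each ID occurs once in df1, the result has the same number of rows as df2. With no `by` argument, left_join joins on every column name the two data frames share, so pass `by` to name the identifier column explicitly if they share more than one.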
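
To preserve df2's row count as the question asks, df2 has to be the left-hand table of the join. Below is a minimal sketch in base R using made-up toy data; the column names `id`, `name` and `value` are assumptions for illustration, not taken from the question. A dplyr equivalent is shown as a comment.

```r
# Toy data; column names (id, name, value) are hypothetical.
df1 <- data.frame(id = c(1, 2, 3, 4),
                  name = c("a", "b", "c", "d"))       # one row per id; id 4 is not in df2
df2 <- data.frame(id = c(1, 1, 2, 3, 3, 3),
                  value = c(10, 11, 20, 30, 31, 32))  # several rows per id

# Keep every row of df2 (all.x = TRUE) and attach the matching df1 columns.
merged <- merge(df2, df1, by = "id", all.x = TRUE)

# The row count matches df2 because id is unique in df1.
stopifnot(nrow(merged) == nrow(df2))

# dplyr equivalent (requires the dplyr package):
# merged <- dplyr::left_join(df2, df1, by = "id")
```

Note that `merge` sorts the result by the key column by default, so the row count is preserved but the original row order of df2 may not be; `left_join` keeps the order of its first argument.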
