I have two data frames that share some columns, but each also has columns of its own.

They also contain some of the same observations.

To give a concrete example:

df1 <- data.frame(Name = c("a", "b", "c", "d"),
                  age = 1:4,
                  party = 3:6)

df2 <- data.frame(Name = c("a", "e", "c", "f"),
                  other = 10:13,
                  party = 3:6)

Both data frames contain the observations for a and c.

How would I merge the dfs to create a new df that contains all the columns, but does not repeat observations?

Abe
  • Thanks for showing that this question already has an answer. One thing to consider is that many answers to that question assume some familiarity with database or SQL-like vocabulary. – Abe Jul 27 '17 at 22:03

2 Answers

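You can use merge() from base R. By default it joins on every column the two data frames have in common (here Name and party), and all = TRUE keeps the rows that appear in only one of them.

merge(df1, df2, all = TRUE)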
# Name party age other
# 1    a     3   1    10
# 2    b     4   2    NA
# 3    c     5   3    12
# 4    d     6   4    NA
# 5    e     4  NA    11
# 6    f     6  NA    13
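
If you would rather not rely on the default choice of key columns, you can name them yourself. The sketch below is only an illustration built on the example data from the question; the object names merged and by_name are mine, not part of the original answer. It also shows what to expect if you merge on Name alone.

```r
# Same example data as in the question.
df1 <- data.frame(Name = c("a", "b", "c", "d"), age = 1:4, party = 3:6)
df2 <- data.frame(Name = c("a", "e", "c", "f"), other = 10:13, party = 3:6)

# Name the key columns explicitly instead of relying on the default
# (all columns the two data frames have in common).
merged <- merge(df1, df2, by = c("Name", "party"), all = TRUE)

# Check that no Name/party combination is repeated (0 means no duplicates).
anyDuplicated(merged[c("Name", "party")])

# Merging on Name alone keeps both party columns, suffixed .x and .y.
by_name <- merge(df1, df2, by = "Name", all = TRUE)
names(by_name)
```

Merging on Name alone is worth knowing about because the shared party column is then treated as ordinary data rather than as part of the key, so it shows up twice.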
Eldioo

You can use full_join() from dplyr for that:

library(dplyr)

full_join(df1, df2)

Which gives you:

  Name age party other
1    a   1     3    10
2    b   2     4    NA
3    c   3     5    12
4    d   4     6    NA
5    e  NA     4    11
6    f  NA     6    13
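
When by is left out, dplyr picks the shared columns for you and prints a message saying which ones it used. A minimal sketch with the keys spelled out, assuming dplyr is installed and using the example data from the question (the name joined and the stringsAsFactors = FALSE argument are my additions, the latter so that Name is a plain character column on older R versions too):

```r
library(dplyr)

# Example data from the question, with Name kept as character.
df1 <- data.frame(Name = c("a", "b", "c", "d"), age = 1:4, party = 3:6,
                  stringsAsFactors = FALSE)
df2 <- data.frame(Name = c("a", "e", "c", "f"), other = 10:13, party = 3:6,
                  stringsAsFactors = FALSE)

# Full join on both shared columns; rows found in only one data frame
# are kept, with NA in the columns that come from the other one.
joined <- full_join(df1, df2, by = c("Name", "party"))
joined
```

Spelling out by makes it obvious which columns define "the same observation", which matters if the two data frames later gain another column with a matching name.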

Hope that helps!

Dave Gruenewald