0

Given two dataframes with some number of shared columns, how to I concatenate the dataframes together by automatically finding columns shared in common and appending one to the other (automatically filling in NA for the columns that aren't shared across dataframes)

If I had this, for example:

 df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
 df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

df1
#  CustomerId Product
#           1 Toaster
#           2 Toaster
#           3 Toaster
#           4   Radio
#           5   Radio
#           6   Radio

df2
#  CustomerId   State
#           7 Alabama
#           8 Alabama
#           9    Ohio

magicFunction(df1, df2) would return a dataframe like this:

final_df
df1
#  CustomerId Product State
#           1 Toaster NA
#           2 Toaster NA
#           3 Toaster NA
#           4   Radio NA
#           5   Radio NA
#           6   Radio NA
#           7 NA      Alabama
#           8 NA      Alabama
#           9 NA      Ohio

I'm tried dplyr's different join varieties, but can't figure this out.

Parseltongue
  • 11,157
  • 30
  • 95
  • 160
  • 1
    You missed one of those varieties: `dplyr::bind_rows(df1, df2)`. Btw, your input data has 2 4 6 in df2, but you display 7 8 9 – Frank Oct 24 '18 at 20:49

1 Answers1

2

bind_rows should solve your problem

library(tidyverse)
df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

new_df <- bind_rows(df1, df2)
Henry Cyranka
  • 2,970
  • 1
  • 16
  • 21