I have these two dataframes:

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2L, 4L, 7L), State = c(rep("Alabama", 2), rep("Ohio", 1))) 

I want to merge them so that the result keeps only the rows present in df1.

I tried merge(x = df1, y = df2, by = "CustomerId", all = TRUE), but it returns 7 rows. How do I drop the last row, shown below, and keep only the 6 rows of df1?

7 7 <NA> Ohio

Yamuna_dhungana

1 Answer

We need all.x instead of all. The 'x' and 'y' arguments of merge represent 'df1' and 'df2' respectively. Using all = TRUE implies a full join, which includes the rows from both 'df1' and 'df2'. Specifying all.x = TRUE does a left join, and all.y = TRUE does a right join. Without any of these, merge does an inner join.

merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)
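
To see the four join types side by side, here is a small sketch using the data from the question. The variable names (inner, left, right, full) are only illustrative and do not appear in the original post.

```r
# Sample data from the question
df1 <- data.frame(CustomerId = 1:6,
                  Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2L, 4L, 7L),
                  State = c(rep("Alabama", 2), "Ohio"))

# Illustrative names for each join type
inner <- merge(df1, df2, by = "CustomerId")                # only ids in both: 2, 4
left  <- merge(df1, df2, by = "CustomerId", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)  # every row of df2
full  <- merge(df1, df2, by = "CustomerId", all = TRUE)    # rows from both

# Compare the number of rows returned by each join
sapply(list(inner = inner, left = left, right = right, full = full), nrow)
# inner = 2, left = 6, right = 3, full = 7
```

In the left join, customers without a match in df2 (1, 3, 5 and 6) get NA for State, and customer 7 is dropped because it is not in df1.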
akrun