I am struggling with what seems to be a simple task. I have a blank dataframe with columns as shown in "A". I have a dataframe with columns as shown in B. B has fewer columns that A. I need to merge the two so file A has all the data from B. However, the file A order of columns can not change. If there is not a matching column in B, then the column in A must remain but be empty. I've tried many ways and am coming up short. The column names need to be mapped together as well. For example, "Cust No" from B needs to go to "ID" in A. "Customer Name" in B must go to "Name" in A. I could just change the names in B, then merge the two files, then use Index to rearrange the column names in A, but that did not work and is cumbersome. Any advice?
Asked
Active
Viewed 47 times
-4
-
3Please make a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) or [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) with a sample input and your expected output. This is needed to create, test and verify possible solutions. – Martin Gal Dec 19 '21 at 00:04
-
1The first step is definitely to change the column names in B to match A. So names(B)[1] <- "ID" where the "1" here refers to the first column, and do the same for the other columns so that they are the same for A and B. Then you can try the appropriate merge or join. I'm thinking C <- full_join(A, B, by = "ID") – hachiko Dec 19 '21 at 00:12
1 Answers
0
Using dplyr
from the tidyverse
you should be able to do this:
library(tidyverse)
AB <-
B %>%
rename(ID = `Cust No`, Name = `Customer Name`) %>%
full_join(A, ., by = "ID")
rename
will let you change the names in B
manually so that they match A
in the form new_name = old_name
. full_join
will keep all rows in both data frames and placing A
as the first argument and B
(represented as the .
) as the second argument will put the columns in A first. by =
ensures you're matching by the column ID
.

Andrew
- 490
- 3
- 9