What is the most efficient way to add data from one dataframe to another based on a common, recurring ID in R?

Question

I have a small dataframe with about 1000 entries that have unique IDs, along with some information associated with each ID.

id, type
ID1, A
ID2, B
ID3, C
ID4, A
ID5, B

I have a second very large dataframe where the IDs from the first frame are repeated in no particular order.

id
ID1
ID1
ID3
ID1
ID3
ID3
ID5
...

Given that the second frame is very large, I would like to know the most efficient way to add the 'type' from the first data frame to a new variable in the second, like so:

id, type
ID1, A
ID1, A
ID3, C
ID1, A
ID3, C
ID3, C
ID5, B

Thanks in advance!

Try `merge(df2, df1, all.x = TRUE)`. Note that your 2nd data.frame comes first. — Rui Barradas, Nov 09 '21 at 20:22

score 0 · Answer 1 · answered Nov 09 '21 at 20:21

merge(datB, datA, by = "id", all.x = TRUE)
#    id type
# 1 ID1    A
# 2 ID1    A
# 3 ID1    A
# 4 ID3    C
# 5 ID3    C
# 6 ID3    C
# 7 ID5    B

(`merge()` sorts the result on the key column by default, so the original row order of `datB` is not preserved; if you need that order, add an ordering column to the data before merging.)
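One way to keep the original row order is to record each row's position before merging and sort on it afterwards. The following is a minimal sketch, assuming a helper column is acceptable; the column name `.row` is illustrative only, and the data are rebuilt here from the question's example so the block runs on its own.

```r
# Example data mirroring the question (rebuilt here for a self-contained run)
datA <- data.frame(id = c("ID1", "ID2", "ID3", "ID4", "ID5"),
                   type = c("A", "B", "C", "A", "B"),
                   stringsAsFactors = FALSE)
datB <- data.frame(id = c("ID1", "ID1", "ID3", "ID1", "ID3", "ID3", "ID5"),
                   stringsAsFactors = FALSE)

# Record the original row position (".row" is a hypothetical helper name)
datB$.row <- seq_len(nrow(datB))

# Left join: keep every row of datB, attach 'type' from datA
merged <- merge(datB, datA, by = "id", all.x = TRUE)

# Restore the original order and drop the helper column
merged <- merged[order(merged$.row), c("id", "type")]
rownames(merged) <- NULL
merged
```

After the reordering step, `merged$id` matches the original `datB$id` row for row.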

Data

datA <- structure(list(id = c("ID1", "ID2", "ID3", "ID4", "ID5"), type = c("A", "B", "C", "A", "B")), class = "data.frame", row.names = c(NA, -5L))
datB <- structure(list(id = c("ID1", "ID1", "ID3", "ID1", "ID3", "ID3", "ID5")), class = "data.frame", row.names = c(NA, -7L))
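As an alternative to `merge()` when only a single column needs to be attached, a base R lookup with `match()` leaves the large frame's row order untouched. This is a sketch rather than a benchmarked recommendation: the names `lookup` and `big` are hypothetical stand-ins for the small and large data frames, and whether it is actually faster for your data is worth measuring.

```r
# Hypothetical stand-ins for the small lookup table and the large data frame
lookup <- data.frame(id = c("ID1", "ID2", "ID3", "ID4", "ID5"),
                     type = c("A", "B", "C", "A", "B"),
                     stringsAsFactors = FALSE)
big <- data.frame(id = c("ID1", "ID1", "ID3", "ID1", "ID3", "ID3", "ID5"),
                  stringsAsFactors = FALSE)

# match() returns, for each big$id, the row index of that id in lookup$id
# (NA when the id is absent); indexing lookup$type by it yields the types
big$type <- lookup$type[match(big$id, lookup$id)]
big
```

IDs in `big` that do not appear in `lookup` get `NA`, which mirrors the behaviour of `all.x = TRUE` in `merge()`.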
