How can I calculate a new column for a dataframe in R based on another dataframe?

Question

I'm having trouble adding a new column to a dataframe based on values from another dataframe. I'm going to try to provide a minimal, replicable example; of course my real dataframes have lots of other columns, but I don't think they matter here.

So let's say I have two dataframes, DF1 and DF2.

ID <- c(1, 2, 3, 4, 5)
material <- c("a", "b", "c", "d", "a")
DF1 <- data.frame(ID, material)

ID <- c("a", "b", "c", "d,")
group <- c("x", "x", "y", "z")
DF2 <- data.frame(ID, group)

ID is a unique key in each dataframe. I'd like to perform an operation where I match DF1$material to DF2$ID, and add a column to DF1 that contains the associated value from DF2$group.

So in the above example the result would look like:

ID <- c(1, 2, 3, 4, 5)
material <- c("a", "b", "c", "d", "a")
group <- c("x", "x", "y", "z", "x")
DF1 <- data.frame(ID, material, group)

I'd preferably like to use tidyverse/dplyr to achieve this. I've tried inner_join and left_join, as well as mutate() with an ifelse() statement. The joins create a ton of duplicate columns, and the ifelse() produces a warning: "longer object length is not a multiple of shorter object length".

I'm sure there's an obvious solution here. How can I do this?

Does this answer your question? [How to merge two data.table by different column names?](https://stackoverflow.com/questions/29865512/how-to-merge-two-data-table-by-different-column-names) — S-SHAAF, Mar 09 '23 at 23:51

score 1 · Answer 1 · answered Mar 09 '23 at 23:06

1

DF1$newcolumn <- DF2$group[match(DF2$ID,DF1$material)]

answered Mar 09 '23 at 23:06

bkleiboeker

21
3

score 1 · Answer 2 · edited Mar 09 '23 at 23:37

1

You can also use merge() which is part of the base package of R

DF1<-merge(DF1, DF2, by.x = "material", by.y = "group")

edited Mar 09 '23 at 23:37

Ahmed Sbai

10,695
9
19
38

answered Mar 09 '23 at 23:13

William Bender

21
3

score 0 · Accepted Answer · answered Mar 09 '23 at 23:06

0

We may use the different column

library(dplyr)
left_join(DF1, DF2, by = c("material" = "ID"))
   ID material group
1  1        a     x
2  2        b     x
3  3        c     y
4  4        d     z
5  5        a     x

answered Mar 09 '23 at 23:06

akrun

874,273
37
540
662

How can I calculate a new column for a dataframe in R based on another dataframe?

3 Answers3