Conditional adding of values in R with two dataframes

Question

I have two dataframes (df1, df2).

df1 <- data.frame(term = c("A", "B", "C", "D", "E", "F"))

df2 <- data.frame(term = c("C", "F", "G"), freq = c(7, 3, 5))

In df1, I want to add a column ("freq") based on the values of "freq" in df2. So if the term in df1 and the term in df2 match, the count ("freq") of this term should be added to df1. Else it should be "0" (zero).

How can I do it, so that the processing time is as small as possible? Is there a way how to do it with dplyr? I cannot figure it out!!!

akrun · Accepted Answer · 2016-05-18T08:38:11.640

1

If we need a faster option, a data.table join can be used along with assigning (:=) the NA values to 0 in place.

library(data.table)
setDT(df2)[df1, on = "term"][is.na(freq), freq := 0][]

Or to avoid copies, as @Arun mentioned, create a 'freq' column in 'df1' and then join on 'term' replace the 'freq' with the corresponding 'i.freq' values.

setDT(df1)[, freq := 0][df2, freq := i.freq, on = "term"]

Or use left_join

library(dplyr)
left_join(df1, df2, by = 'term') %>%
       mutate(freq = replace(freq, is.na(freq), 0)

edited May 18 '16 at 08:38

answered May 18 '16 at 07:09

akrun

874,273
37
540
662

1

I'd do: `setDT(df1)[, freq := 0][df2, freq := i.freq, on = "term"]` which avoids unnecessary copies. – Arun May 18 '16 at 08:36

Conditional adding of values in R with two dataframes

1 Answers1