Is there a way to replace column values on matching rows while joining one data.frame to another?

Question

I'm currently working on incorporating elements of one data table with another "master" set, by some reference column. To make things clearer, I have created some sample data:

This is the dataset I am looking to join onto another "master set".

data.frame(refID = c(1,3,4,5,7,8), value = c(3.3,3.9,4.4,8.0,1.1,2.5))

refID   value
1       3.3
3       3.9
4       4.4
5       8.0
7       1.1
8       2.5

The master set:

data.frame(refID = 1:9, value = rep(0,9))

refID   value
1       0
2       0
3       0
4       0
5       0
6       0
7       0
8       0
9       0

I am basically looking to send the column of values from the first data set into the second, but where there are gaps, let them have a value of 0. Ultimately, I am hoping to get:

Resulting Set:

refID   value
1       3.3
2       0.0   
3       3.9
4       4.4
5       8.0
6       0.0
7       1.1
8       2.5
9       0.0

I've played around with some stuff in the dplyr and data.table packages but can't seem to really pinpoint a good and direct way of doing it. Advice would be greatly appreciated, many thanks.

Please provide reproducible examples. – Sven Hohenstein Mar 07 '15 at 05:40 — Sven Hohenstein, Mar 07 '15 at 05:40

score 6 · Answer 1 · edited May 23 '17 at 11:46

Using data.table, you can replace values from the first data.table on to the second by reference as follows:

require(data.table)

# data
DT1 = data.table(refID = c(1,3,4,5,7,8), value = c(3.3,3.9,4.4,8.0,1.1,2.5))
DT2 = data.table(refID = 1:9, value = 0)

setkey(DT2, refID)
DT2[DT1, value := i.value]
#    refID value
# 1:     1   3.3
# 2:     2   0.0
# 3:     3   3.9
# 4:     4   4.4
# 5:     5   8.0
# 6:     6   0.0
# 7:     7   1.1
# 8:     8   2.5
# 9:     9   0.0

Please refer to this post for explanation.

score 0 · Answer 2 · answered Mar 07 '15 at 07:04

The way I can think of would be to temporarily name the values val1 and val2, do a full_join so non-matching values are NA, then mutate a value column using the presence of the NAs. i.e. something like:

df_1 <- data.frame(refID = c(1,3,4,5,7,8), v1 = c(3.3,3.9,4.4,8.0,1.1,2.5))
df_2 <- data.frame(refID = 1:9, v2 = 0)

df_merged <- df_1 %>% 
    full_join(df_2) %>%
    mutate(value=ifelse(!is.na(v1), v1, v2) %>% # or just 0 in this case 
    select(refID, value)

Is there a way to replace column values on matching rows while joining one data.frame to another?

2 Answers2