As far as I understand the OP's intention from the many comments, the goal is to update the giraffe data frame with the names of several other data frames wherever runkey matches.
This can be achieved by combining the other data frames into a single data.table object, treating the data frame names as data, and finally updating giraffe in a join.
Sample Data
According to the OP, giraffe consists of 500 rows and 5 columns, including runkey and project. Here, project is initialized as a character column so that it can hold the data frame names assigned in the subsequent join.
set.seed(123L) # required for reproducible data
giraffe <- data.frame(runkey = 1:500,
                      X2 = sample.int(99L, 500L, TRUE),
                      X3 = sample.int(99L, 500L, TRUE),
                      X4 = sample.int(99L, 500L, TRUE),
                      project = "",
                      stringsAsFactors = FALSE)
Then there are a number of data frames, each of which contains only one column, runkey. According to the OP, the runkey values are disjoint, i.e., the combined set of all runkey values does not contain any duplicates.
spine_hlfs <- data.frame(runkey = c(1L, 498L, 5L))
ir_dia <- data.frame(runkey = c(3L, 499L, 47L, 327L))
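The disjointness assumption can be checked before joining. The following is only a minimal sketch using the two sample data frames from above; the helper name all_keys is made up for illustration.

```r
# Sample data frames as defined above
spine_hlfs <- data.frame(runkey = c(1L, 498L, 5L))
ir_dia <- data.frame(runkey = c(3L, 499L, 47L, 327L))

# Collect all runkey values into one vector (all_keys is a hypothetical name)
all_keys <- unlist(lapply(list(spine_hlfs, ir_dia), `[[`, "runkey"))

# anyDuplicated() returns 0 if no value occurs more than once
no_overlap <- anyDuplicated(all_keys) == 0L
no_overlap
```

If this returned FALSE, at least one runkey would appear in more than one data frame, and the project assigned to it would depend on the order of the rows in the join.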
Proposed solution
# specify names of data frames
df_names <- c("spine_hlfs", "ir_dia")
# create named list of data frames
df_list <- mget(df_names)
# update on join
library(data.table)
setDT(giraffe)[rbindlist(df_list, idcol = "df.name"), on = "runkey", project := df.name][]
     runkey X2 X3 X4    project
  1:      1  2 44 63 spine_hlfs
  2:      2 73 99 77
  3:      3 43 20 18     ir_dia
  4:      4 73 12 40
  5:      5  2 25 96 spine_hlfs
 ---
496:    496 75 45 84
497:    497 24 63 43
498:    498 33 53 81 spine_hlfs
499:    499  1 33 16     ir_dia
500:    500 99 77 41
Explanation
setDT() converts giraffe to a data.table by reference, i.e., without copying. rbindlist(df_list, idcol = "df.name") creates a combined data.table from the list of data frames and fills the df.name column with the names of the list elements:
      df.name runkey
1: spine_hlfs      1
2: spine_hlfs    498
3: spine_hlfs      5
4:     ir_dia      3
5:     ir_dia    499
6:     ir_dia     47
7:     ir_dia    327
This intermediate result is joined on runkey with giraffe. The project column is updated with the contents of df.name only for matching rows.
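To see the update join in isolation, here is a minimal sketch on two tiny tables. The names target and lookup and their contents are made up for illustration and are not part of the OP's data.

```r
library(data.table)

# Hypothetical miniature tables, for illustration only
target <- data.table(runkey = 1:4, project = "")
lookup <- data.table(df.name = c("a", "b"), runkey = c(2L, 4L))

# Update join: rows of target whose runkey is found in lookup
# get project overwritten by reference; all other rows are left untouched
target[lookup, on = "runkey", project := df.name]
target[]
```

Rows 2 and 4 of target receive "a" and "b", respectively, while rows 1 and 3 keep the empty string.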
Alternative solution
This loops over df_names and performs repeated joins, each of which updates giraffe in place:
setDT(giraffe)
for (x in df_names) giraffe[get(x), on = "runkey", project := x]
giraffe[]
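As a rough cross-check, both approaches can be run on fresh copies and compared. This is only a sketch: make_giraffe() is a hypothetical helper that builds a reduced giraffe with just the two relevant columns, and the loop iterates over a named list instead of calling get(), which is a small variation on the code above.

```r
library(data.table)

spine_hlfs <- data.frame(runkey = c(1L, 498L, 5L))
ir_dia <- data.frame(runkey = c(3L, 499L, 47L, 327L))
df_list <- list(spine_hlfs = spine_hlfs, ir_dia = ir_dia)

# Hypothetical helper: reduced giraffe with only runkey and project
make_giraffe <- function() data.table(runkey = 1:500, project = "")

# Approach 1: single update join on the combined data.table
g1 <- make_giraffe()
g1[rbindlist(df_list, idcol = "df.name"), on = "runkey", project := df.name]

# Approach 2: one update join per data frame
g2 <- make_giraffe()
for (nm in names(df_list)) g2[df_list[[nm]], on = "runkey", project := nm]

# Both approaches should fill project identically when runkey is disjoint
same_result <- identical(g1$project, g2$project)
same_result
```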