You should definitely give your demo data frame an "ID" column as well! Then you do not have to hope that the demographics are correctly assigned to the observations, especially if the script is still changing while you work on it. This is easily done using transform (I simply use the consecutive IDs 1:3 here in the example).
res <- lapply(list(df1, df2, df3, df4), merge, transform(demo, ID=1:3))
res
# [[1]]
# ID b c df sex age vital_sts
# 1 1 x gh z m 30 a
# 2 2 y fg x m 50 a
# 3 3 z xv y f 62 d
#
# [[2]]
# ID v hg fd sex age vital_sts
# 1 1 a yty z m 30 a
# 2 2 mm zc x m 50 a
# 3 3 xc cx y f 62 d
#
# [[3]]
# ID t j sd sex age vital_sts
# 1 1 ae ewr z m 30 a
# 2 2 yw zd x m 50 a
# 3 3 zs x y f 62 d
#
# [[4]]
# ID u k f sex age vital_sts
# 1 1 df df z m 30 a
# 2 2 y zs x m 50 a
# 3 3 z xf y f 62 d
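To illustrate why the "ID" column matters, here is a minimal sketch with made-up toy data (demo_toy and meas_toy are hypothetical, not your data). It assumes the measurements arrive in a different row order than the demographics, and compares pairing by position with pairing by key.

```r
## Toy data (hypothetical): same three IDs, different row order.
demo_toy <- data.frame(ID=1:3, sex=c("m", "m", "f"))
meas_toy <- data.frame(ID=c(3L, 1L, 2L), value=c("z", "x", "y"))

## cbind pairs rows by position, so the demographics of ID 1 end up
## next to the measurement of ID 3.
by_position <- cbind(demo_toy, value=meas_toy$value)

## merge pairs rows by the shared "ID" column, whatever the row order.
by_key <- merge(demo_toy, meas_toy, by="ID")

by_position$value  # "z" "x" "y"  (misaligned)
by_key$value       # "x" "y" "z"  (aligned by ID)
```

With an "ID" in both data frames the result no longer depends on how the rows happen to be sorted.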
If you have gazillions of data frames in your workspace, as it looks like, you may list them by pattern using mget(ls(pattern=)). (Or better yet, change your code to get them in a list in the first place.)
lapply(mget(ls(pattern='^df\\d+')), merge, transform(demo, ID=1:3))
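As a rough sketch of the "list in the first place" idea: instead of creating df1, df2, ... as separate objects and collecting them afterwards, you could build them as elements of one named list. The names dfs, demo_toy and res_toy and the toy values are assumptions for illustration only.

```r
## Hypothetical toy demographics with an ID column.
demo_toy <- data.frame(ID=1:3, sex=c("m", "m", "f"))

## Build the data frames directly as elements of one named list ...
dfs <- list(
  df1=data.frame(ID=1:3, b=c("x", "y", "z")),
  df2=data.frame(ID=1:3, v=c("a", "mm", "xc"))
)

## ... so there is nothing to collect from the workspace afterwards.
## merge() joins on the only common column, "ID".
res_toy <- lapply(dfs, merge, demo_toy)
names(res_toy)  # "df1" "df2"
```

The list names carry over to the result, so each merged data frame stays identifiable.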
Edit
If I understand your comment correctly, you have a large data frame DAT from which you want to assemble smaller data frames of variable groups and merge demo to them. In this case I would put the variable names of these groups in a named list vgroups. Next, lapply over it to simultaneously subset DAT, with "ID" concatenated to each group, and merge the result to demo.
demo should still have an "ID", because you don't want to trust that all rows are sorted in the same order. Just consider, for example, sort(c(3, 10, 1, 100)) vs. sort(as.character(c(3, 10, 1, 100))), or rows omitted for whatever reason, etc.
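The two caveats can be checked quickly. This is only a sketch with made-up toy data (ids, demo_toy, meas_toy are hypothetical): it shows that numeric and character sorting order the same IDs differently, and that merge still pairs rows correctly when one observation is missing.

```r
ids <- c(3, 10, 1, 100)

sort(ids)                # numeric order:   1 3 10 100
sort(as.character(ids))  # character order: "1" "10" "100" "3"

## Hypothetical toy data: ID 10 is missing from the measurements.
demo_toy <- data.frame(ID=ids, sex=c("m", "m", "f", "f"))
meas_toy <- data.frame(ID=c(1, 3, 100), value=c("a", "b", "c"))

## merge keeps the IDs present in both and pairs them by key.
merged <- merge(demo_toy, meas_toy, by="ID")
merged$ID   # 1 3 100
merged$sex  # "f" "m" "f"
```

Binding the two toy data frames by row position would not even be possible here, since they differ in length.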
demo <- transform(demo, ID=1:3) ## identify demo observations
vgroups <- list(g1=c("b", "c", "df"), g2=c("v", "hg", "fd"), g3=c("t", "j", "sd"),
g4=c("u", "k", "f"))
res1 <- lapply(vgroups, \(x) merge(demo, DAT[, c('ID', x)], by="ID"))
## (specifying by="ID" explicitly is even safer)
res1
# $g1
# ID sex age vital_sts b c df
# 1 1 m 30 a x gh z
# 2 2 m 50 a y fg x
# 3 3 f 62 d z xv y
#
# $g2
# ID sex age vital_sts v hg fd
# 1 1 m 30 a a yty z
# 2 2 m 50 a mm zc x
# 3 3 f 62 d xc cx y
#
# $g3
# ID sex age vital_sts t j sd
# 1 1 m 30 a ae ewr z
# 2 2 m 50 a yw zd x
# 3 3 f 62 d zs x y
#
# $g4
# ID sex age vital_sts u k f
# 1 1 m 30 a df df z
# 2 2 m 50 a y zs x
# 3 3 f 62 d z xf y
Access individual data frames:
res1$g1
# ID sex age vital_sts b c df
# 1 1 m 30 a x gh z
# 2 2 m 50 a y fg x
# 3 3 f 62 d z xv y
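If you still want the individual data frames in your environment, use list2env with envir=.GlobalEnv (without envir, it puts them in a new environment instead of your workspace):
list2env(res1, envir=.GlobalEnv)
ls()
# [1] "DAT" "demo" "g1" "g2" "g3" "g4" "res1" "vgroups"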
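For what it's worth, here is a small sketch of how the envir argument of list2env decides where the objects end up. The list lst and the environments e_new and e are hypothetical stand-ins; a separate environment is used here so the example does not clutter the workspace.

```r
## Hypothetical toy list standing in for a list of data frames.
lst <- list(g1=data.frame(ID=1:3), g2=data.frame(ID=1:3))

## Without `envir`, list2env() creates and returns a new environment;
## the calling workspace is left untouched.
e_new <- list2env(lst)
sort(ls(e_new))  # "g1" "g2"

## With `envir`, the objects are assigned into that environment.
## Passing envir=.GlobalEnv would put them into the workspace instead.
e <- new.env()
invisible(list2env(lst, envir=e))
exists("g1", envir=e, inherits=FALSE)  # TRUE
```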
Data:
DAT <- structure(list(ID = 1:3, b = c("x", "y", "z"), c = c("gh", "fg",
"xv"), df = c("z", "x", "y"), f = c("z", "x", "y"), fd = c("z",
"x", "y"), hg = c("yty", "zc", "cx"), j = c("ewr", "zd", "x"),
k = c("df", "zs", "xf"), sd = c("z", "x", "y"), t = c("ae",
"yw", "zs"), u = c("df", "y", "z"), v = c("a", "mm", "xc"
), x1 = c("gs", "gs", "gs"), x2 = c("cs", "cs", "cs"), x3 = c("tv",
"tv", "tv"), x4 = c("fb", "fb", "fb")), row.names = c(NA,
-3L), class = "data.frame")
demo <- data.frame(sex = c('m', 'm', 'f'), age = c('30', '50', '62'), vital_sts = c('a', 'a', 'd'))