I have a bunch of data frames named df1, df2, ..., dfN and lt1, lt2, ..., ltN.

I would like to merge them in a loop, something like:

for (X in 1:N){
outputX <- merge(dfX, ltX, ...)
}

But I am having trouble getting the names output, dfX, and ltX to change in each iteration. I realize that plyr/data.table/reshape might offer an easier way, but I would like a for loop to work.

Perhaps I should clarify. The data frames are quite large, which is why plyr etc. will not work (they crash). I would like to avoid copying. The next step in the code is to save the merged data frame. This is why I prefer the for-loop approach, since I know what each merged data frame is named in the environment.

Repmat
  • This would be much easier if they were in a list, e.g. a list named `lt` with N elements, each of them a data frame. [See here](http://stackoverflow.com/a/24376207/903061) for next time. – Gregor Thomas Mar 12 '15 at 22:40

2 Answers

You can combine data frames into lists and use mapply, as in the example below:

i <- 1:3
d1.a <- data.frame(i=i,a=letters[i])
d1.b <- data.frame(i=i,A=LETTERS[i])

i <- 11:13
d2.a <- data.frame(i=i,a=letters[i])
d2.b <- data.frame(i=i,A=LETTERS[i])

L1 <- list(d1.a, d2.a)
L2 <- list(d1.b, d2.b)

mapply(merge, L1, L2, SIMPLIFY = FALSE)
# [[1]]
#   i a A
# 1 1 a A
# 2 2 b B
# 3 3 c C
# 
# [[2]]
#    i a A
# 1 11 k K
# 2 12 l L
# 3 13 m M
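
If the data frames already exist as numbered objects, the two lists do not have to be typed out by hand. A minimal sketch, assuming the objects follow the df1..dfN / lt1..ltN naming from the question (the toy data and the names `dfs`, `lts`, and `merged` are made up for illustration):

```r
# Toy stand-ins for the numbered objects in the question (hypothetical data)
df1 <- data.frame(i = 1:3, a = letters[1:3])
df2 <- data.frame(i = 11:13, a = letters[11:13])
lt1 <- data.frame(i = 1:3, A = LETTERS[1:3])
lt2 <- data.frame(i = 11:13, A = LETTERS[11:13])
N <- 2

# mget() looks up several objects by name and returns them as a named list
dfs <- mget(paste0("df", seq_len(N)))
lts <- mget(paste0("lt", seq_len(N)))

# Map(f, x, y) is a wrapper for mapply(f, x, y, SIMPLIFY = FALSE)
merged <- Map(merge, dfs, lts)
names(merged) <- paste0("output", seq_len(N))
```

Collecting existing data frames into a list should not duplicate the underlying data by itself, since R only copies an object when it is modified.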

If you'd like to save each of the resulting data frames in the global environment (I'd advise against it though), you could do:

result <- mapply(merge, L1, L2, SIMPLIFY = FALSE)
names(result) <- paste0('output',seq_along(result))

which will give a name to every data frame in the list, and then:

sapply(names(result),function(s) assign(s,result[[s]],envir = globalenv()))
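
As a possible alternative to looping over the names with assign, base R's list2env() copies every element of a named list into an environment in one call. A small sketch, using a hypothetical two-element list in place of the merged results:

```r
# Hypothetical named list standing in for the merged results
result <- list(output1 = data.frame(i = 1:3),
               output2 = data.frame(i = 11:13))

# list2env() creates one object per named list element in the target
# environment; invisible() suppresses printing of the returned environment
invisible(list2env(result, envir = globalenv()))
```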

Please note that this is a base R solution that does essentially the same thing as your sample code.
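
If you really do want the literal for loop from the question, the object names can be built as strings and resolved with get() and assign(). This is only a sketch on toy data; the contents of the data frames are invented, and merge() is called with its defaults (joining on the shared column `i`):

```r
# Toy stand-ins for the numbered objects in the question (hypothetical data)
df1 <- data.frame(i = 1:3, a = letters[1:3])
df2 <- data.frame(i = 11:13, a = letters[11:13])
lt1 <- data.frame(i = 1:3, A = LETTERS[1:3])
lt2 <- data.frame(i = 11:13, A = LETTERS[11:13])
N <- 2

for (X in seq_len(N)) {
  # get() fetches the object whose name is the given string
  merged <- merge(get(paste0("df", X)), get(paste0("lt", X)))
  # assign() creates output1, output2, ... in the current environment
  assign(paste0("output", X), merged)
}
```

After the loop, output1 and output2 exist as ordinary data frames, which matches the naming the question asks for.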

Marat Talipov

If your data frames are in a list, writing a for loop is trivial:

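# lt = list(lt1, lt2, lt3, ...)
# if your data is very big, this may run you out of memory
# the anchored pattern matches only lt1, lt2, ... (not e.g. `lt` or `result`)
# note: ls() sorts names alphabetically, so lt10 comes before lt2
lt = lapply(ls(pattern = "^lt[0-9]+$"), get)

merged_data = merge(lt[[1]], lt[[2]])

# seq_along(lt)[-(1:2)] is empty when there are only two data frames,
# whereas 3:length(lt) would count down (3, 2) and fail
for (i in seq_along(lt)[-(1:2)]) {
    merged_data = merge(merged_data, lt[[i]])
    save(merged_data, file = paste0("merging", i, ".rda"))
}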
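
The question pairs each dfX with the matching ltX rather than merging everything cumulatively. A sketch of that pairwise variant, assuming the data frames sit in two parallel lists (`dfs`, `lts`, and `out_dir` are hypothetical names, and the data is invented), writes each merged result to disk as soon as it is produced so that only one merged data frame is held at a time:

```r
# Hypothetical parallel lists standing in for df1..dfN and lt1..ltN
dfs <- list(data.frame(i = 1:3, a = letters[1:3]),
            data.frame(i = 11:13, a = letters[11:13]))
lts <- list(data.frame(i = 1:3, A = LETTERS[1:3]),
            data.frame(i = 11:13, A = LETTERS[11:13]))

out_dir <- tempdir()  # placeholder output location; change as needed

for (i in seq_along(dfs)) {
  merged <- merge(dfs[[i]], lts[[i]])
  # one file per pair; `merged` is overwritten on the next iteration
  saveRDS(merged, file.path(out_dir, paste0("output", i, ".rds")))
}
```

A single result can later be read back with readRDS() under whatever name you like.
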
Gregor Thomas