1

I have been trying to merge multiple data.table into one data table. All the data tables have same key.

s1.dt,s2.dt and s3.dt are three data tables with common columns o and d.

Presently, I am doing it separately for the three data tables but in the entire data set, there are total of 100 data tables like these. So I was wondering if there is R function to do the same to avoid the tedious recurring code?

Here is my code:

s1.dt= data.table(s1, key = c("o","d"))

s2.dt= data.table(s2, key = c("o","d"))

s3.dt= data.table(s3, key = c("o","d"))

system.time(s12.dt<-merge(s1.dt,s2.dt,all = T, allow.cartesian = T ))

system.time(s123.dt<-merge(s12.dt,s3.dt,all = T, allow.cartesian = T ))
Frank
  • 66,179
  • 8
  • 96
  • 180
  • 1
    Fyi: https://stackoverflow.com/questions/32526889/merge-multiple-data-tables-with-duplicate-column-names – Frank Oct 18 '18 at 17:59

1 Answers1

2

We keep the datasets in a list and use Reduce to merge all of them

Reduce(function(...) merge(..., all = TRUE, allow.cartesian = TRUE), 
                  list(s1.dt, s2.dt, s3.dt))

If there are many datasets, instead of typing object names to create the list, use either paste or ls with mget

mget(ls(pattern = "^s\\d+\\.dt$"))
akrun
  • 874,273
  • 37
  • 540
  • 662