0

I have 1 base file that looks as follows:

ID  x1
1   5
2   20
3   14
4   8
5   20

I have 1.000 seperate files, that each contain extra variables. 2 examples are:

ID  x2
1   45
2   85
3   42
7   52
8   41


ID  x2
10  54
12  4
4   7
5   21

What I would like to get it:

ID  x1  x2
1   5   45
2   20  85
3   14  42
4   8   7
5   20  21

I only want to add matching IDs in "mydata". I initially tried to merge all separate files into 1 single file and then merge, but the size is too large. I also tried merging in loop, but this keeps adding new variables. Any suggestions on how to merge this?

research111
  • 347
  • 5
  • 18

2 Answers2

1

Depending on how you are reading your data in the solution will be slightly different.

However you can read in large amounts of files with apply and assign them to varying levels of a list.

files.list = list()
sapply(c(1:100), function(x){files.list[[x]] = read.csv(file = paste0("file_",x,".csv"))})

then each object in your list is a data.frame and you can use this excellent answer:

merged.data.frame = Reduce(function(...) merge(..., all=T), files.list)

https://stackoverflow.com/a/8097519/4604054

Community
  • 1
  • 1
zacdav
  • 4,603
  • 2
  • 16
  • 37
  • I get the following error "Error in files.list[[x]] = read.table(file = paste0("data[ ", x, " ].txt")) : more elements supplied than there are to replace" I am using the following code: `sapply(c(1:100), function(x){files.list[[x]] = read.table(file = paste0("data[ ",x," ].txt"))})` In the list I have 642 files, each named "data[ 1 ]", where 1 ranges from 1 to 642 Any idea why? – research111 Apr 15 '16 at 21:44
  • I really can't help unless i had at least 2-3 of the files to test. This code is more of an example and should work depending on you have your data originally. – zacdav Apr 16 '16 at 03:58
0

Something along this maybe?

m1 <- merge(baseTab, tab2, by = "ID")

# Loop through datasets start here
for (tabName in tabList) {
  m2 <- merge(baseTab, tabName, by = "ID")
  m1 <- rbind(m1, m2)
}
# and ends here
m-dz
  • 2,342
  • 17
  • 29