
I am trying to merge 2000 files of 2 MB each into a single file. Each file has 2 columns of 100,000 rows. It is simple code, but I am getting an error that reads:

Error: cannot allocate vector of size 37.3 Gb

How do I solve this?

I am running 64-bit R on Windows 10 with 8 GB of RAM. I have tried increasing `memory.limit()` to 50 GB, but to no avail. Any suggestions, please?

Here is my code; `location` is the variable that contains the path:

    multmerge <- function(mypath) {
      filenames <- list.files(path = mypath, full.names = TRUE)
      datalist <- lapply(filenames, function(x) read.csv(file = x, header = TRUE))
      Reduce(function(x, y) merge(x, y), datalist)
    }
    new_data <- multmerge(location)
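
For reference, if the intent is to stack the files on top of each other rather than join them column-by-column, a row-binding variant (a sketch under that assumption, not a confirmed fix; `multmerge_rows` is a hypothetical name) keeps the output close to the ~4 GB of raw input (2000 × 2 MB):

```r
# Sketch: stack files vertically instead of merging them pairwise.
# Assumes all files share the same two column names.
multmerge_rows <- function(mypath) {
  filenames <- list.files(path = mypath, full.names = TRUE)
  datalist  <- lapply(filenames, read.csv, header = TRUE)
  do.call(rbind, datalist)  # one concatenation instead of 1999 merges
}
```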
  • what's your `memory.limit()` size? – Adam Quek May 16 '16 at 06:58
  • Does this work if you run it on two of your files? What about 20? 100? 1000? Where does it crash? How big is the output data frame getting as you scale up? Maybe you need more RAM above the 8GB that the machine has? How much? Well, you don't know until you can find out how far the analysis can go. – Spacedman May 16 '16 at 07:00
  • Also, you might consider `mclapply` instead of `lapply` for such a task. – Adam Quek May 16 '16 at 07:01
  • See [here](http://stackoverflow.com/questions/5171593), [here](http://stackoverflow.com/questions/10917532) and [here](http://stackoverflow.com/questions/11564775); put simply, there is not enough memory to process the merge. Consider merging [outside R](http://unix.stackexchange.com/questions/113898)? – zx8754 May 16 '16 at 07:02
  • Have you considered storing the data in a SQLite database, for instance? See this related post: http://stackoverflow.com/questions/29593177/manipulation-of-large-files-in-r/29593454#29593454 – Dominic Comtois May 16 '16 at 07:02
  • The number of columns in a `merge` of a two column matrix can *double* if the column names aren't right. You could be ending up with a matrix with zillions of columns. You've not given us any of your data nor shown chunks of it nor convinced us of its correctness. How can we help? – Spacedman May 16 '16 at 07:10
  • @AdamQuek: memory.limit() is 50GB. – Krish18790 May 16 '16 at 07:50
  • @Spacedman: It worked perfectly for 300 files; it doesn't work with 1500. – Krish18790 May 16 '16 at 07:51
  • Edit your post to show us the first 10 lines of three or four of your files, and explain what output you want. – Spacedman May 16 '16 at 09:38
  • Possible duplicate of [R memory management / cannot allocate vector of size n Mb](https://stackoverflow.com/questions/5171593/r-memory-management-cannot-allocate-vector-of-size-n-mb) – Benjamin May 23 '18 at 01:50
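
As the comments above point out, `merge()` joins on all shared column names by default; with non-unique keys the row count multiplies and the value columns get duplicated, which is one way a few gigabytes of input can demand tens of gigabytes. A toy illustration (made-up three-row frames, not the asker's data):

```r
a <- data.frame(id = c(1, 1, 2), value = c(10, 11, 12))
b <- data.frame(id = c(1, 1, 2), value = c(20, 21, 22))

# Joining on a non-unique key multiplies rows and doubles value columns:
m <- merge(a, b, by = "id")
nrow(m)   # 5: id = 1 matches 2 x 2 = 4 times, id = 2 once
names(m)  # "id" "value.x" "value.y"

# Stacking instead grows linearly with the input:
s <- rbind(a, b)
nrow(s)   # 6
```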
