
When I run mclapply:

> ListofCSVs <- mclapply(list.files(pattern = "2013"), function(n) {
    read.table(n, header = TRUE, sep = ",", stringsAsFactors = FALSE)
  }, mc.cores = 12)

Where list.files(pattern = "2013") lists 12 CSV files:

> list.files(pattern = "2013")
 [1] "BONDS 2013 01.csv" "BONDS 2013 02.csv" "BONDS 2013 03.csv" "BONDS 2013 04.csv" "BONDS 2013 05.csv" "BONDS 2013 06.csv" "BONDS 2013 07.csv"
 [8] "BONDS 2013 08.csv" "BONDS 2013 09.csv" "BONDS 2013 10.csv" "BONDS 2013 11.csv" "BONDS 2013 12.csv" 

I get:

Warning message:
In mclapply(list.files(pattern = ".csv"), function(n) { :
  scheduled core 2, 1 encountered error in user code, all values of the job will be affected.

And print(ListofCSVs[1]) returns "fatal error in wrapper code".
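A minimal diagnostic sketch (not part of my original code; failed is an illustrative name): mclapply returns "try-error" objects for the elements whose job failed, so checking the classes of the result shows which files were actually affected.

    # Sketch: find which elements of the mclapply result are errors.
    # mclapply returns "try-error" objects for elements whose job failed.
    failed <- sapply(ListofCSVs, inherits, what = "try-error")
    which(failed)        # indices of the affected files
    ListofCSVs[failed]   # the underlying error messages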

I have tried this, but my data is still not loading correctly.

This suggests it may be a problem of too many threads.
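One way to probe that (a sketch only, not something I have confirmed fixes it) is to disable prescheduling so each file gets its own fork and an error only affects that one element:

    library(parallel)

    # Sketch: mc.preschedule = FALSE forks one job per file, so a failure in
    # one file no longer affects the other values scheduled on the same core.
    ListofCSVs <- mclapply(list.files(pattern = "2013"), function(n) {
      read.table(n, header = TRUE, sep = ",", stringsAsFactors = FALSE)
    }, mc.cores = 12, mc.preschedule = FALSE)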

I can load the files correctly with lapply.

I have also checked that each read.table call works on its own, so I do not think it is a data issue:

# "i" is a placeholder for each month's number (01 through 12)
ai <- read.table("BONDS 2013 i.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
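Spelled out as a loop over all twelve files, that check is roughly the following (names here are illustrative):

    # Sketch: read each 2013 file sequentially and confirm read.table succeeds.
    for (f in list.files(pattern = "2013")) {
      tmp <- read.table(f, header = TRUE, sep = ",", stringsAsFactors = FALSE)
      cat(f, ":", nrow(tmp), "rows,", ncol(tmp), "columns\n")
    }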

Each CSV is about 1 GB with 40 columns.

I have also used foreach with %dopar% and it works.
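Roughly, that call looked like this (a sketch assuming the doParallel backend; the backend registration is not shown in my original code):

    library(foreach)
    library(doParallel)

    registerDoParallel(cores = 12)  # assumption: 12 workers, matching mc.cores above

    # Read every 2013 file in parallel; foreach returns a list of data frames by default
    ListofCSVs <- foreach(n = list.files(pattern = "2013")) %dopar% {
      read.table(n, header = TRUE, sep = ",", stringsAsFactors = FALSE)
    }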

I have run the same code with two cores and it doesn't work.

I have run it with one core and it works.

The data is in the working directory.

Thanks!

I have 16 cores and 122 GB of RAM (Amazon AWS, Linux).

UPDATE: This works...

> ListofCSVs <- parLapply(cl, list.files(pattern = ".csv"), function(n) {
    read.table(n, header = TRUE, sep = ",", stringsAsFactors = FALSE)
  })

go figure....
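For completeness, cl in that snippet is a cluster object whose creation is not shown above; it was set up with something like the following (a sketch using a PSOCK cluster from the parallel package; 12 workers is an assumption):

    library(parallel)

    cl <- makeCluster(12)  # PSOCK cluster: fresh R processes rather than forks

    ListofCSVs <- parLapply(cl, list.files(pattern = ".csv"), function(n) {
      read.table(n, header = TRUE, sep = ",", stringsAsFactors = FALSE)
    })

    stopCluster(cl)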
