I have access to two clusters where R has been installed. Until now I have been writing and testing my code on only one of them. When I moved my code to the second cluster, all matrix multiplications suddenly became very slow. Here are some timings:

Cluster-1:
> a <- matrix(0, nrow=2000, ncol=2000)
> b <- matrix(0, nrow=2000, ncol=2000)
> system.time(c <- a %*% b)
   user  system elapsed 
   0.07    0.03    0.10

Cluster-2:
> a <- matrix(0, nrow=2000, ncol=2000)
> b <- matrix(0, nrow=2000, ncol=2000)
> system.time(c <- a%*% b)
   user  system elapsed 
 13.682   0.014  13.695

Note that I am not using any sparse matrices.

Cluster-1 uses R version 2.12.1 and Cluster-2 uses R version 2.15.0. Is there any special library that the second cluster is missing? How do I find which one? Thanks.

EDIT: Adding more details about the clusters:

Cluster-1:

> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Cluster-2:

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C                  
 [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915    
 [5] LC_MONETARY=en_US.iso885915    LC_MESSAGES=en_US.iso885915   
 [7] LC_PAPER=C                     LC_NAME=C                     
 [9] LC_ADDRESS=C                   LC_TELEPHONE=C                
[11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C           

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
isEmpty

1 Answer

You may be using a non-optimized BLAS. See here for an example: http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html

If so, it's an easy fix.

You can also try compiling and other tricks; see the question "Speed up the loop operation in R".
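
One way to check whether a non-optimized BLAS is the likely cause is to time a dense product and ask R which BLAS library it loaded. The following is only a sketch: the helper name `time_matmul` is hypothetical, random matrices are used instead of the all-zero ones in the question, and it assumes that `sessionInfo()` exposes `$BLAS` and `$LAPACK` only in newer R releases (3.4.0 and later). On older versions such as 2.12 or 2.15 those fields are absent and the script prints `NA`.

```r
# Hypothetical helper: time one dense n x n matrix product.
time_matmul <- function(n = 2000) {
  a <- matrix(rnorm(n * n), nrow = n, ncol = n)
  b <- matrix(rnorm(n * n), nrow = n, ncol = n)
  timing <- system.time(prod_ab <- a %*% b)
  list(timing = timing, dims = dim(prod_ab))
}

set.seed(1)
res <- time_matmul(2000)
print(res$timing)

# Report which BLAS/LAPACK shared libraries this R session uses.
# Assumption: these fields exist only in R >= 3.4.0; on older
# versions they are NULL, so fall back to NA.
info <- sessionInfo()
blas_path <- if (is.null(info$BLAS)) NA_character_ else info$BLAS
lapack_path <- if (is.null(info$LAPACK)) NA_character_ else info$LAPACK
cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")
```

If the reported path points to R's bundled `libRblas` and the elapsed time is on the order of seconds rather than a fraction of a second, the reference BLAS is a plausible explanation for the slowdown.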

Ari B. Friedman
  • I would have thought that something else is going on; 13 seconds for that operation is crazy slow even with the regular RBLAS. Could something else be running on the cluster? – Hansi Apr 13 '12 at 12:12
  • @Hansi, On my machine (a couple years old), `c <- a%*% b` takes 12 seconds. However, @isEmpty, perhaps posting the result of `sessionInfo()` from both clusters would help. – BenBarnes Apr 13 '12 at 12:16
  • I don't have root access to the cluster, so I can't install anything from the repo. Also, I saw somewhere in forums that compiling ATLAS isn't a simple task. Maybe I should talk to the system administrator and see what can be done about it. – isEmpty Apr 13 '12 at 12:17
  • @Hansi Nothing else can be running on that cluster. I rent nodes exclusively for my use. It's not shared with anyone else. – isEmpty Apr 13 '12 at 12:26
  • @BenBarnes I have edited my question and added the sessionInfo information of both clusters. – isEmpty Apr 13 '12 at 12:26
  • @isEmpty: even without root access you can still try to build R and the optimized BLAS yourself (user-local; but of course you need a whole bunch of -dev versions of libraries). If your system runs a recent Ubuntu, installing OpenBLAS is so easy that your admin may do it if you ask really nicely: install (ubuntu) package libopenblas and enjoy. In case of Debian, have a look at Dirk's `gcbd` package (on CRAN). – cbeleites unhappy with SX Apr 13 '12 at 13:45
  • Thanks guys for your help. Since the machine runs the Rocks Cluster OS, I compiled GotoBLAS and installed it. It was super simple. Even though it is not as fast as ATLAS, it is faster than RBLAS. – isEmpty Apr 23 '12 at 01:37