
Double for loops in R become very slow as n increases. Is there any way to improve the speed of these for loops?

    set.seed(1)
    n=1000

    y=rnorm(n)
    x1=rnorm(n)
    x2=rnorm(n)

    lm.ft=function(y,x1,x2)
      lm.fit(cbind(1,x1.bar,x2.bar), y)$coef

    res=array(,dim=c(1,3,n,n))
    for(i in 1:n)
      for(j in 1:n){
       x1.bar=x1-x1[i]
       x2.bar=x2-x2[j]
       res[,,i,j]=lm.ft(y,x1.bar,x2.bar)
      }
user1690124
  • You couldn't possibly expect it to be growing less than quadratically in `n`. This example would require 3,000,000 separate calls to `lm.fit`. You should explain what you are doing (or being asked to do), and please indicate if this is your homework problem. – IRTFM Dec 19 '13 at 17:30
  • 1
  • @IShouldBuyABoat I only count 1,000,000 calls to lm.fit, but that's surely enough. OP is not realizing that over 98% of the time is spent inside `lm.fit`. – Joris Meys Dec 19 '13 at 17:40
  • Agree. I was looking at the second index to dim() and multiplying. If I could have figured out what the goal actually was, I was thinking that it might have been addressed by simple incremental adjustments to a single lm.fit. – IRTFM Dec 19 '13 at 18:56

2 Answers


Just to give you a complete answer: apart from some oddities in your code (such as using x1.bar and x2.bar inside lm.ft rather than its arguments x1 and x2), my profiling says: what the hell are you trying to achieve?

If I run the profiler on your own code:

Rprof("profile1.out")
for(i in 1:n)
  for(j in 1:n){
    x1.bar=x1-x1[i]
    x2.bar=x2-x2[j]
    res[,,i,j]=lm.ft(y,x1.bar,x2.bar)
  }
Rprof(NULL)
summaryRprof("profile1.out")

I get the following interesting picture:

> summaryRprof("profile1.out")
$by.self
                self.time self.pct total.time total.pct
".Call"              0.96    22.86       0.96     22.86
"lm.fit"             0.92    21.90       4.08     97.14
...
"cbind"              0.22     5.24       0.22      5.24
...

$by.total
                total.time total.pct self.time self.pct
"lm.ft"               4.12     98.10      0.04     0.95
"lm.fit"              4.08     97.14      0.92    21.90
...
"cbind"               0.22      5.24      0.22     5.24
...

98% of the time is spent just fitting the model. The loop isn't slow; the fact that you're trying to fit 1 million models is what makes you wait. You really have to rethink your question.
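
To put a rough number on that (a back-of-the-envelope check of my own, not part of the original answer, reusing the y, x1 and x2 simulated in the question), you can time a batch of bare lm.fit calls and scale up to the n^2 = 1,000,000 iterations of the double loop:

one.k <- system.time(for (k in 1:1000) lm.fit(cbind(1, x1, x2), y))["elapsed"]
one.k / 1000 * 1e6   # projected seconds spent in lm.fit alone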

If that's really what you want to do, then optimizing your function would involve getting rid of the overhead of lm.fit and vectorizing the subtraction. That saves about 50%.

lm.ft=function(y,x1,x2)
  .Call(stats:::C_Cdqrls, cbind(1,x1,x2), y, tol=1e-7)$coef
# NB: in more recent versions of R, C_Cdqrls takes a fourth logical 'check'
# argument, e.g. .Call(stats:::C_Cdqrls, cbind(1,x1,x2), y, 1e-7, FALSE)

# precompute all differences at once: column i of x1.bar is x1 - x1[i]
x1.bar <- outer(x1,x1,`-`)
x2.bar <- outer(x2,x2,`-`)
for(i in 1:n)
  for(j in 1:n){
    res[,,i,j]=lm.ft(y,x1.bar[,i],x2.bar[,j])
  }
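
As a quick sanity check of my own (not part of the original answer, and assuming lm.ft as defined just above, including the extra check argument on newer R versions), the vectorized differences and the stripped-down fitter reproduce what the original loop computed:

i <- 3; j <- 5
all.equal(x1.bar[, i], x1 - x1[i])   # column i of outer(x1, x1, `-`)
all.equal(x2.bar[, j], x2 - x2[j])
all.equal(unname(lm.fit(cbind(1, x1.bar[, i], x2.bar[, j]), y)$coef),
          unname(lm.ft(y, x1.bar[, i], x2.bar[, j])))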
Joris Meys
  • Is there a similar way to optimize coxph in the survival package? The code is coxph(Surv(t,delta)~z1+z2)$coefficient – user1690124 Dec 19 '13 at 20:12
  • This comes back to the question of what you're trying to do. If you provided more context people might give solutions that would avoid the need to fit 1 million models ... – Ben Bolker Dec 19 '13 at 21:36
  • What I mean is something like the .Call(stats:::C_Cdqrls, cbind(1,x1,x2), y, tol=1e-7)$coef code. – user1690124 Dec 20 '13 at 04:13
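
(Editorial sketch for the coxph question in the comments above, not from the original thread: the analogue of the .Call trick is to call the workhorse fitter coxph.fit from the survival package directly, skipping the formula and model-frame overhead of coxph. The wrapper name fast.cox is purely illustrative, and the argument list follows recent survival versions; check the function's definition in your installed version, and note that some releases only expose it as survival:::coxph.fit.)

library(survival)
# illustrative wrapper: fit Cox coefficients without formula/model.frame overhead
fast.cox <- function(t, delta, z1, z2)
  coxph.fit(x = cbind(z1, z2), y = Surv(t, delta),
            strata = NULL, offset = NULL, init = NULL,
            control = coxph.control(), weights = NULL,
            method = "efron", rownames = NULL)$coefficients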

If you want to do something crazy like that, you should use Rcpp:

library(RcppEigen)
library(inline)

incl <- '
using  Eigen::LLT;
using  Eigen::Lower;
using  Eigen::Map;
using  Eigen::MatrixXd;
using  Eigen::MatrixXi;
using  Eigen::Upper;
using  Eigen::VectorXd;
using  Eigen::Vector3d;
typedef  Map<MatrixXd>  MapMatd;
typedef  Map<MatrixXi>  MapMati;
typedef  Map<VectorXd>  MapVecd;
inline MatrixXd AtA(const MatrixXd& A) {
  int n(A.cols());
  return  MatrixXd(n,n).setZero().selfadjointView<Lower>().rankUpdate(A.adjoint());
}
'

body <- '
const MapMatd        X(as<MapMatd>(XX));
const MapVecd        y(as<MapVecd>(yy));
const int            n(X.rows()), m(X.cols());   
LLT<MatrixXd>        llt; 
MatrixXd             Res(n*n,m), Xbar(n,m);
Vector3d             betahat;
for (int i = 0; i < n; ++i) {
 for (int j = 0; j < n; ++j) {
  Xbar=X;
  for (int k = 0; k < n; ++k) {
   Xbar(k,1) -= X(i,1);
   Xbar(k,2) -= X(j,2);
  };
  llt=AtA(Xbar);
  betahat =llt.solve(Xbar.adjoint() * y);
  Res.row(i*n+j) = betahat;
 };
};
return                wrap(Res);
'

crazyLm <- cxxfunction(signature(XX = "matrix", yy = "numeric"), 
                            body, "RcppEigen", incl)

set.seed(1)
n=4

y=rnorm(n)
x1=rnorm(n)
x2=rnorm(n)

lm.ft=function(y,x1,x2) lm.fit(cbind(1,x1.bar,x2.bar), y)$coef

res=array(,dim=c(3,n,n))
for(i in 1:n)
  for(j in 1:n){
    x1.bar=x1-x1[i]
    x2.bar=x2-x2[j]
    res[,i,j]=lm.ft(y,x1.bar,x2.bar)
  }

res2 <- aperm(array(t(crazyLm(cbind(1,x1,x2), y)), dim=c(3,n,n)), c(1,3,2))
all.equal(res, res2)
#[1] TRUE

system.time({
set.seed(1)
n=1000

y=rnorm(n)
x1=rnorm(n)
x2=rnorm(n)
res <- aperm(array(t(crazyLm(cbind(1,x1,x2), y)), dim=c(3,n,n)), c(1,3,2))
})

#  User      System     elapsed 
#36.130       0.033      36.158 

That allows you to fit one million models in under one minute. However, I don't see a use case.

Roland