
I edited the lasso code from this site to use it for multiple lambda values. For comparison, I used the lassoshooting package (which handles a single lambda value) for the one-lambda case and glmnet for the multiple-lambda case.

The coefficient estimates differ, which is expected because of standardization and scaling back to the original scale. That is out of scope and not important here.
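(As a side note, I believe something like the following would put glmnet's estimates on the same raw scale as the shooting algorithm, though I have not verified the exact match:)

library(glmnet)
# untested side note: disable internal standardization and the intercept so
# that glmnet's estimates live on the same raw scale as the shooting code
fit <- glmnet(X, y, lambda = lambda, standardize = FALSE, intercept = FALSE)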

In the single-lambda case, lassoshooting is 1.5 times faster than my C++ code.

In the multiple-lambda case, both methods run over the same 100 lambda values, but glmnet is 7.5 times faster than my C++ code. Of course I expected glmnet to be faster, but this gap seems too large. Is it normal, or is my code wrong?



EDIT

I also attached the lshoot function, which computes the coefficient path in an R loop. It outperforms my C++ code too.

Can I improve my C++ code?

C++ code:

// [[Rcpp::depends(RcppArmadillo)]]

#include <RcppArmadillo.h>
#include <cmath>     // std::abs, std::copysign
#include <algorithm> // std::max
using namespace Rcpp;
using namespace arma;


// soft-thresholding operator, applied element-wise:
// S(x, y) = sign(x) * max(|x| - y, 0)
// [[Rcpp::export]]
vec softmax_cpp(const vec & x, const vec & y) {
  return sign(x) % max(abs(x) - y, zeros(x.n_elem));
}

// cyclic coordinate descent for the lasso over a grid of lambda values;
// beta carries over between consecutive lambdas (a simple warm start)
// [[Rcpp::export]]
mat lasso(const mat & X, const vec & y, const vec & lambda,
          const double tol = 1e-7, const int max_iter = 10000){
  int p = X.n_cols; int lam = lambda.n_elem;
  mat XX2 = 2 * (X.t() * X); // precompute 2 * X'X once
  vec Xy2 = 2 * (X.t() * y); // precompute 2 * X'y once
  mat betas = zeros(p, lam); // to store the betas

  vec beta = zeros(p); // initial beta for each lambda

  bool converged = false;
  int iteration = 0;
  vec beta_prev;
  double aj, cj; // plain scalars; avoids allocating 1x1 vectors each update

  for(int l = 0; l < lam; l++){
    while (!converged && (iteration < max_iter)){

      beta_prev = beta;

      for (int j = 0; j < p; j++){
        aj = XX2(j,j);
        cj = Xy2(j) - dot(XX2.row(j), beta) + beta(j) * aj;
        // soft-threshold cj/aj at lambda(l)/aj, in scalar arithmetic
        double z = cj / aj;
        beta(j) = std::copysign(std::max(std::abs(z) - lambda(l) / aj, 0.0), z);
      }
      iteration++;
      converged = norm(beta_prev - beta, 1) < tol; // L1 change below tolerance
    }
    betas.col(l) = beta;
    iteration = 0;
    converged = false;
  }
  return betas;
}

R code:

library(Rcpp)
library(rbenchmark)
library(glmnet)
library(lassoshooting)

sourceCpp("LASSO.cpp")

library(ElemStatLearn)
X <- as.matrix(prostate[,-c(9,10)])
y <- as.matrix(prostate[,9])
lambda_one <- 0.1
benchmark(cpp=lasso(X,y,lambda_one),
          lassoshooting=lassoshooting(X,y,lambda_one)$coefficients,
          order="relative", replications=100)[,1:4]

################################################
lambda <- seq(0,10,len=100)

benchmark(cpp=lasso(X,y,lambda),
          glmn=coef(glmnet(X,y,lambda=lambda)),
          order="relative", replications=100)[,1:4]

####################################################

EDIT

lambda <- seq(0,10,len=100)

lshoot <- function(lambda){
  betas <- matrix(NA, 8, 100) # 8 coefficients x 100 lambda values
  for(l in 1:100){
    betas[, l] <- lassoshooting(X, y, lambda[l])$coefficients
  }
  return(betas)
}

benchmark(cpp=lasso(X,y,lambda),
          lassoshooting_loop=lshoot(lambda),
          order="relative", replications=300)[,1:4]

Results for the single-parameter case:

           test replications elapsed relative
2 lassoshooting          300    0.06      1.0
1           cpp          300    0.09      1.5

Results for the multiple-parameter case:

  test replications elapsed relative
2 glmn          300    0.70    1.000
1  cpp          300    5.24    7.486

Results for the lassoshooting loop vs. the C++ code:

                test replications elapsed relative
2 lassoshooting_loop          300    4.06    1.000
1                cpp          300    6.38    1.571
mert

1 Answer

Package {glmnet} uses warm starts and special rules for discarding lots of predictors, which makes fitting the whole "regularization path" very fast.

See their paper: Friedman, Hastie and Tibshirani (2010), "Regularization Paths for Generalized Linear Models via Coordinate Descent", Journal of Statistical Software.
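For intuition, here is a rough R sketch of those two ideas (warm starts plus the sequential strong rule of Tibshirani et al. (2012)). Here cd_fit stands for a hypothetical single-lambda coordinate-descent solver, not a real function, and the exact thresholds depend on how the objective is scaled:

# rough sketch, not glmnet's actual code: warm starts + strong-rule screening
# 'cd_fit' is a hypothetical single-lambda solver accepting a start value;
# the rule assumes a decreasing lambda sequence and is crude at the first one
path_sketch <- function(X, y, lambdas) {
  p <- ncol(X)
  beta <- rep(0, p)
  betas <- matrix(0, p, length(lambdas))
  for (k in seq_along(lambdas)) {
    r <- y - X %*% beta                       # residuals at the warm start
    score <- as.vector(abs(crossprod(X, r)))  # |x_j' r| for every predictor
    thr <- if (k == 1) lambdas[1] else 2 * lambdas[k] - lambdas[k - 1]
    keep <- score >= thr                      # sequential strong rule
    beta[!keep] <- 0                          # discarded predictors stay at 0
    if (any(keep)) {
      beta[keep] <- cd_fit(X[, keep, drop = FALSE], y, lambdas[k],
                           start = beta[keep])  # warm start from previous fit
    }
    # real implementations also check KKT conditions to catch rule violations
    betas[, k] <- beta
  }
  betas
}

Solving only over the surviving predictors, starting from the previous solution, is what makes the whole path cheap: each fit typically needs just a few passes over a small subset of columns.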

F. Privé
  • This explains the situation. (+1) I attached new code comparing lassoshooting and my cpp code for the multiple-parameter case (see edits). But my cpp code is still slow. Is that normal too? What edits could improve my cpp code? – mert Aug 07 '18 at 10:18
  • @mert As mentioned by another commenter, you seem to be compiling the code without optimisations. If so, no wonder it's slow. Modern C++ compilers rely *heavily* on optimisations being enabled (see the Makevars sketch after these comments). – Konrad Rudolph Aug 07 '18 at 10:25
  • I'm sorry, but I'm very new to C++ (via Rcpp). Maybe what you write is obvious, but I don't know how to do it. Can you suggest a reference to learn from? – mert Aug 07 '18 at 10:34
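For later readers: one common way to enable compiler optimisations for code built via Rcpp::sourceCpp is a user-level Makevars file (the path and flags below are platform-dependent assumptions; on Windows the file is Makevars.win):

# ~/.R/Makevars -- user-level compiler flags picked up when R compiles C++
CXXFLAGS = -O3
CXX11FLAGS = -O3

Re-running sourceCpp("LASSO.cpp", verbose = TRUE) should print the flags actually used, so you can confirm the optimisation level took effect.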