1

I am working on Ridge regression, I want to make my own function. It tried the following. It work for individual value of k but not for array for sequence of values.

dt<-longley
attach(dt)
library(MASS) 
X<-cbind(X1,X2,X3,X4,X5,X6)
X<-as.matrix(X)
Y<-as.matrix(Y)

sx<-scale(X)/sqrt(nrow(X)-1)
sy<-scale(Y)/sqrt(nrow(Y)-1)
rxx<-cor(sx)
rxy<-cor(sx,sy)

for (k in 0:1){
res<-solve(rxx+k*diag(rxx))%*%rxy
k=k+0.01
}

Need help for optimized code too.

Cœur
  • 37,241
  • 25
  • 195
  • 267
itfeature.com
  • 380
  • 4
  • 16
  • Can you clean up your example some? We don't know what the input data is, there is no `Y` but `Y` is used, `rid` is created by not used, there is no `k` mentioned in your explanation. Perhaps this will be of use: http://stackoverflow.com/q/5963269 – Bryan Hanson Aug 10 '13 at 16:10
  • @ Bryan Hanso: code updated. Longley is default data set, but have different names that I used. You can consider any data set. – itfeature.com Aug 10 '13 at 16:15
  • @itfeature.com we don't want to consider *any* data set, we want to consider yours. If you want help, you stand a better chance if we are able to run your example without having to expend any extra unnecessary effort on our behalf. Please edit it so it works. – Simon O'Hanlon Aug 10 '13 at 16:28
  • What part do you need optimized? What do you consider a bottleneck? – Roman Luštrik Aug 10 '13 at 16:40
  • 1
    `for (k in 0:1)` will use 0 on the first pass and then 1. `k = k + 0.01` will do nothing the way you have it written, since `k` comes from the `for` statement. Perhaps you want something more along the line of `for (k in seq(0, 1, by = 0.01)`? and take out the `k = k + 0.1` By the way, what you presented is script not a function. – Bryan Hanson Aug 10 '13 at 16:48

1 Answers1

1
poly.kernel <- function(v1, v2=v1, p=1) {   
    ((as.matrix(v1) %*% t(v2))+1)^p
}   

KernelRidgeReg <- function(TrainObjects,TrainLabels,TestObjects,lambda){

  X <- TrainObjects
  y <- TrainLabels                      
  kernel <- poly.kernel(X)

  design.mat <- cbind(1, kernel)

  I <- rbind(0, cbind(0, kernel))

  M <- crossprod(design.mat) + lambda*I
  #crossprod is just x times  traspose of x, just looks neater in my openion

  M.inv <- solve(M)
  #inverse of M

  k <- as.matrix(diag(poly.kernel(cbind(TrainObjects,TrainLabels))))
  #Removing diag still gives the same MSE, but will output a vector of prediction.

  Labels <- rbind(0,as.matrix(TrainLabels))

  y.hat <- t(Labels) %*% M.inv %*% rbind(0,k)

  y.true <- Y.test

  MSE <-mean((y.hat - y.true)^2) 

  return(list(MSE=MSE,y.hat=y.hat))

}

Kernel with p=1, will give you ridge regression.

Solve built-in R function sometimes return singular matrix. You may want to write your own function to avoid that.

Waqas
  • 329
  • 3
  • 16