I am trying to write a batch gradient descent function in R to use on a training and test set of data. So far I have the code below. However, when I run it, it only prints the final parameters and the number of iterations it ran. I would like to store the parameters and test error at each iteration, and be able to visualise the cost convergence process. I am not sure where to put this or how to incorporate it into the function below.
GradD <- function(x, y, alpha = 0.006, epsilon = 10^-10){
  iter <- 0
  i <- 0
  x <- cbind(rep(1, nrow(x)), x)
  theta <- matrix(c(1,1), ncol(x), 1)
  cost <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
  delta <- 1
  while (delta > epsilon){
    i <- i + 1
    theta <- theta - (alpha / nrow(x)) * (t(x) %*% (x %*% theta - y))
    cval <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
    cost <- append(cost, cval)
    delta <- abs(cost[i+1] - cost[i])
    if ((cost[i+1] - cost[i]) > 0){
      print("The cost is increasing. Try reducing alpha.")
      return()
    }
    iter <- append(iter, i)
  }
  print(sprintf("Completed in %i iterations.", i))
  return(theta)
}
TPredict <- function(theta, x){
  x <- cbind(rep(1, nrow(x)), x)
  return(x %*% theta)
}
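To make the "test error at each iteration" part concrete, here is a minimal sketch of what that calculation could look like, assuming the parameter vector from every iteration has been kept in a list. The names `test_cost`, `theta_list`, `x_test` and `y_test` are hypothetical and not part of the code above, and the numbers are toy values chosen only for illustration.

```r
# Hypothetical helper: squared-error cost of one theta on a held-out set.
test_cost <- function(theta, x_test, y_test) {
  x_test <- cbind(1, as.matrix(x_test))   # same intercept column as in training
  sum((x_test %*% theta - y_test)^2) / (2 * nrow(x_test))
}

# Toy stand-ins for illustration only (not real data or results).
x_test <- matrix(c(1, 2, 3), ncol = 1)
y_test <- matrix(c(3, 5, 7), ncol = 1)       # exactly y = 1 + 2x
theta_list <- list(matrix(c(1, 1), 2, 1),    # y_hat = 1 + x
                   matrix(c(1, 2), 2, 1))    # y_hat = 1 + 2x

# One test cost per stored theta; the second theta fits the toy data exactly.
test_errors <- sapply(theta_list, test_cost, x_test = x_test, y_test = y_test)
```

Plotting `test_errors` against the iteration index would then give a test-error curve alongside the training cost.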
EDIT: I have tried to create a list that holds each iteration. However, now I get errors when I run the code:
error.cost <- function(x, y, theta){
  sum( (X %*% theta - y)^2 ) / (2*length(y))
}
num_iters <- 2000
cost_history <- double(num_iters)
theta_history <- list(num_iters)
GradD <- function(x, y, alpha = 0.006, epsilon = 10^-10){
  iter <- 2000
  i <- 0
  x <- cbind(rep(1, nrow(x)), x)
  theta <- matrix(c(1,1), ncol(x), 1)
  cost <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
  delta <- 1
  while (delta > epsilon){
    i <- i + 1
    theta <- theta - (alpha / nrow(x)) * (t(x) %*% (x %*% theta - y))
    cval <- (1/(2*nrow(x))) * t(x %*% theta - y) %*% (x %*% theta - y)
    cost <- append(cost, cval)
    delta <- abs(cost[i+1] - cost[i])
    cost_history[i] <- error.cost(x, y, theta)
    theta_history[[i]] <- theta
    if ((cost[i+1] - cost[i]) > 0){
      print("The cost is increasing. Try reducing alpha.")
      return()
    }
    iter <- append(iter, i)
  }
  print(sprintf("Completed in %i iterations.", i))
  return(theta)
}
I get the error "Error in nrow(x) %*% theta: non-conformable arguments". If I remove the nrow() in this function:
error.cost <- function(x, y, theta){
  sum( (x %*% theta - y)^2 ) / (2*length(y))
}
then it prints out results, but they are the wrong final results and I don't have the iterations stored at all.
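One likely reason the histories come back empty: in R, an assignment such as `cost_history[i] <- ...` inside a function modifies a local copy, so the global `cost_history` and `theta_history` are never updated. A minimal sketch of one way around this, assuming it is acceptable for the function to return a list instead of only `theta`, is below. The name `grad_descent_history` and the `max_iter` cap are hypothetical additions, and the data are simulated purely for illustration.

```r
# Sketch: batch gradient descent that returns its own history.
# grad_descent_history and max_iter are hypothetical, not from the original code.
grad_descent_history <- function(x, y, alpha = 0.006, epsilon = 1e-10,
                                 max_iter = 2000) {
  x <- cbind(1, as.matrix(x))            # add intercept column
  y <- as.matrix(y)
  m <- nrow(x)
  theta <- matrix(1, ncol(x), 1)         # start from all ones
  cost_fn <- function(theta) sum((x %*% theta - y)^2) / (2 * m)

  # Pre-allocate storage; slot 1 holds the starting values.
  cost_history  <- numeric(max_iter + 1)
  theta_history <- vector("list", max_iter + 1)
  cost_history[1]    <- cost_fn(theta)
  theta_history[[1]] <- theta

  n_done <- 0
  for (i in seq_len(max_iter)) {
    theta <- theta - (alpha / m) * (t(x) %*% (x %*% theta - y))
    cost_history[i + 1]    <- cost_fn(theta)
    theta_history[[i + 1]] <- theta
    n_done <- i
    change <- cost_history[i + 1] - cost_history[i]
    if (change > 0) {
      warning("The cost is increasing. Try reducing alpha.")
      break
    }
    if (abs(change) < epsilon) break     # converged
  }

  keep <- seq_len(n_done + 1)            # drop unused pre-allocated slots
  list(theta = theta,
       iterations = n_done,
       cost_history = cost_history[keep],
       theta_history = theta_history[keep])
}

# Simulated data, for illustration only.
set.seed(1)
x <- matrix(runif(50), ncol = 1)
y <- 2 + 3 * x + rnorm(50, sd = 0.1)
fit <- grad_descent_history(x, y, alpha = 0.1, max_iter = 5000)

# Visualise cost convergence.
plot(fit$cost_history, type = "l", xlab = "Iteration", ylab = "Cost")
```

Returning everything in one list avoids relying on global variables, and `fit$theta_history` can then be reused to compute the test error for each iteration.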