I am working with the R programming language. I am trying to perform Stochastic Gradient Descent on custom defined functions.
For instance, here is an example of using Gradient Descent to optimize a custom function (using the well-established "pracma" library):
# define function:
Rastrigin <- function(x)
{
return(20 + x[1]^2 + x[2]^2 - 10*(cos(2*pi*x[1]) + cos(2*pi*x[2])))
}
# run gradient descent:
library(pracma)
> steep_descent(c(1, 1), Rastrigin)
$xmin
[1] 0.9949586 0.9949586
$fmin
[1] 1.989918
$niter
[1] 3
Now, I am trying to run Stochastic Gradient Descent on this same function. I found the following packages that allow for Stochastic Gradient Descent (e.g. https://www.rdocumentation.org/packages/sgd/versions/1.1.1, https://rdrr.io/cran/torch/man/optim_rmsprop.html), but these seem better suited to functions within pre-existing statistical and machine learning models. I also tried looking for popular variants of Stochastic Gradient Descent such as ADAGRAD or RMSPROP, but there does not seem to be any straightforward way to run Stochastic Gradient Descent on custom defined functions.
For instance, suppose I wanted to run Stochastic Gradient Descent on the "Rastrigin" function that I defined above. How would I do this?
Thanks!
Note: I understand that performing Gradient Descent on a function requires knowledge of the function's derivatives. Following this Stack Overflow post (Explicit formula versus symbolic derivatives in R), we can obtain the derivatives of the Rastrigin function:
#load libraries
library(Ryacas0)
library(Ryacas)
#declare the symbols first, so that "z" below is built as a symbolic expression
x <- Sym("x")
y <- Sym("y")
#define Rastrigin function (here I am defining the function in "x" and "y" instead of "x[1]" and "x[2]")
z <- 20 + x^2 + y^2 - 10*(cos(2*pi*x) + cos(2*pi*y))
#first derivative with respect to x (note : 2 * pi = 6.283)
dx <- deriv(z, x, 1)
dx
yacas_expression(2 * x - -62.83185307 * sin(6.28318530717959 * x))
#first derivative with respect to y
dy <- deriv(z, y, 1)
dy
yacas_expression(2 * y - -62.83185307 * sin(6.28318530717959 * y))
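As a sanity check on these derivatives, here is a minimal base R sketch that writes the gradient as an ordinary function and compares it with a central finite-difference approximation. The names rastrigin_grad and numeric_grad, the test point x0 and the step size h are my own choices for illustration, not anything from pracma or Ryacas. Note that 10 * 2 * pi = 20 * pi, which is the 62.83185307 that appears in the symbolic output.

```r
# Rastrigin function, same definition as at the top of the question
Rastrigin <- function(x) {
  20 + x[1]^2 + x[2]^2 - 10 * (cos(2 * pi * x[1]) + cos(2 * pi * x[2]))
}

# Analytic gradient: d/dx_i = 2 * x_i + 20 * pi * sin(2 * pi * x_i)
# (hypothetical helper name, written by hand from the symbolic result)
rastrigin_grad <- function(x) {
  2 * x + 20 * pi * sin(2 * pi * x)
}

# Central finite-difference gradient, used only to cross-check the formula
# (hypothetical helper; h is an arbitrary small step)
numeric_grad <- function(f, x, h = 1e-6) {
  sapply(seq_along(x), function(i) {
    e <- replace(numeric(length(x)), i, h)
    (f(x + e) - f(x - e)) / (2 * h)
  })
}

# Arbitrary test point; the two gradients should agree closely
x0 <- c(0.3, -0.7)
grad_diff <- max(abs(rastrigin_grad(x0) - numeric_grad(Rastrigin, x0)))
grad_diff
```

If the hand-written gradient is right, grad_diff should be very small compared with the size of the gradient itself.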
Now that we know the first derivatives of the Rastrigin function with respect to "x" and "y", can we write a function that performs Stochastic Gradient Descent on the Rastrigin function in R?
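To make the question concrete, here is a rough sketch of the kind of function I have in mind. Since the Rastrigin function is deterministic and has no data to subsample, I am assuming that "stochastic" here means perturbing the exact gradient with Gaussian noise at every step; that is my own stand-in, not necessarily what SGD packages do. The function name noisy_gd and the values of lr, n_iter and noise_sd are hypothetical choices, not taken from any library.

```r
# Rastrigin function, same definition as at the top of the question
Rastrigin <- function(x) {
  20 + x[1]^2 + x[2]^2 - 10 * (cos(2 * pi * x[1]) + cos(2 * pi * x[2]))
}

# Exact gradient: d/dx_i = 2 * x_i + 20 * pi * sin(2 * pi * x_i)
rastrigin_grad <- function(x) {
  2 * x + 20 * pi * sin(2 * pi * x)
}

# Gradient descent with a noisy gradient (assumption: Gaussian noise stands
# in for the randomness that mini-batching would normally provide)
noisy_gd <- function(f, grad, x0, lr = 0.001, n_iter = 2000, noise_sd = 0.1) {
  x <- x0
  for (k in seq_len(n_iter)) {
    g <- grad(x) + rnorm(length(x), sd = noise_sd)  # noisy gradient estimate
    x <- x - lr * g                                 # plain fixed-step update
  }
  list(xmin = x, fmin = f(x), niter = n_iter)
}

set.seed(42)  # make the noise reproducible
res <- noisy_gd(Rastrigin, rastrigin_grad, x0 = c(1, 1))
res$xmin
res$fmin
```

The learning rate is kept small because the curvature of the Rastrigin function is large (the second derivative can reach roughly 2 + 40 * pi^2), so a much larger fixed step would make the iteration unstable. Started from c(1, 1), I would expect this to stay near the same local minimum that steep_descent reported rather than find the global minimum at the origin.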