1

I have a question concerning the computation of a double integral in R. Maybe it is not the best software package to try numerical integration, but we are heavily relying on its stochastic optimisation packages (the function to be optimised is very non-trivial, with lots of local minima), so we cannot switch to MATLAB or other packages.

The problem is the following: it takes a whale of a time to compute the double integral using nested integrate functions, and several times more (!) using the hcubature approach from the cubature package. I tried the first solution from this answer (using hcubature from the cubature package), but it made the timing even worse; besides that, infinite integration limits are not supported, and the integration chokes for (-100, 100) interval already. With the second solution (quad2d from pracma package), the timing is great, but the computation result is way off!

The single integral is computed quite quickly (e.g., if the double integrals are commented out, it takes only 0.2 seconds to compute the value of the function, which is tolerable).

Here is a heavily simplified version of the function for the MWE (just to illustrate the point of integration).

library(cubature)
library(pracma)

# Generate some artificial data to try this function on
set.seed(100)
n <- 200
r <- rnorm(n, 0.0004, 0.01)

# Log-likelihood function accepts 3 parameters:
# [1] shape of positive shocks, [2] shape of negative shocks, [3] DoF of Student's distribution for jumps
parm <- c(6, 7, 10)

LL <- function(parm, cub = "default") {

  shapes <- parm[1:2]
  studdof <- parm[3]

  # For simplification, generate some dynamic series
  set.seed(101)
  sigmaeps <- rgamma(n, shape=shapes[1], rate=1000)
  sigmaeta <- rgamma(n, shape=shapes[2], rate=1000)
  lambdas <- rgamma(n, shape=10, rate=80)+1
  probs <- sapply(lambdas, function(x) dpois(0:2, lambda=x))
  probs <- sweep(probs, 2, colSums(probs), FUN="/") # Normalising the probabilities

  # Reserving memory for 3 series of density
  fw0 <- rep(NA, n)  
  fw1 <- rep(NA, n)
  fw2 <- rep(NA, n)

  for (t in 2:n) {
    integ0 <- function(e) { # First integrand for 0 jumps
      1/sigmaeta[t] * dgamma(-(r[t]-sigmaeps[t]*e)/sigmaeta[t], shape=shapes[2]) * # Density of negative shocks
        dgamma(e, shape=shapes[1]) # Density of positive shocks
    }

    integ1 <- function(e, g) { # Double integrand for 1 jump
      1/sigmaeta[t] * dgamma(-(r[t]-sigmaeps[t]*e-1*g)/sigmaeta[t], shape=shapes[2]) * # Density of negative shocks
        dgamma(e, shape=shapes[1]) * # Density of positive shocks
        dt(g, df = studdof)/1 # Density of jump intensity
    }

    integ2 <- function(e, g) { # Double integrand for 2 jumps
      1/sigmaeta[t] * dgamma(-(r[t]-sigmaeps[t]*e-2*g)/sigmaeta[t], shape=shapes[2]) * # Density of negative shocks
        dgamma(e, shape=shapes[1]) * # Density of positive shocks
        dt(g, df = studdof)/2 # Density of jump intensity
    }

    # Wrappers for cubature because they need vector inputs
    wrapper1 <- function(x) integ1(x[1], x[2])
    wrapper2 <- function(x) integ2(x[1], x[2])

    # Single integral that is not a problem
    fw0[t] <- integrate(integ0, 0, Inf)$value

    if (cub=="cubature") {
      # 2D CUBATURE FROM cubature PACKAGE
      fw1[t] <- hcubature(wrapper1, c(0, -20), c(20, 20))$integral
      fw2[t] <- hcubature(wrapper2, c(0, -20), c(20, 20))$integral
    } else if (cub=="prac2d") {
      # 2D CUBATURE FROM pracma PACKAGE
      fw1[t] <- quad2d(integ1, 0, 100, -100, 100)
      fw2[t] <- quad2d(integ2, 0, 100, -100, 100)
    } else if (cub=="default") {
      # DOUBLE INTEGRALS FROM BUILT-IN INTEGRATE
      fw1[t] <- integrate(function(g) { sapply(g, function(g) { integrate(function(e) integ1(e, g), 0, Inf)$value }) }, -Inf, Inf)$value
      fw2[t] <- integrate(function(g) { sapply(g, function(g) { integrate(function(e) integ2(e, g), 0, Inf)$value }) }, -Inf, Inf)$value 
    }

    if (!t%%10) print(t)
  }
  fw <- fw0*probs[1, ] + fw1*probs[2, ] + fw2*probs[3, ]
  fw <- log(fw[2:n])
  fw[is.nan(fw)] <- -Inf
  slfw <- sum(fw)
  print(paste0("Point: ", paste(formatC(parm, 4, format="e", digits=3), collapse=" "), ", LL: ", round(slfw, 2)))

  return(slfw)
}

system.time(LL(parm, cub="default"))
# 13 seconds
# "Point: 6.000e+00 7.000e+00 1.000e+01, LL: 247.78"
system.time(LL(parm, cub="cubature"))
# 29 seconds, the result is slightly off
# "Point: 6.000e+00 7.000e+00 1.000e+01, LL: 241.7"
system.time(LL(parm, cub="prac2d"))
# 0.5 seconds, the result is way off
# "Point: 6.000e+00 7.000e+00 1.000e+01, LL: 223.25"

(Ideally, integ1(e, g) and integ2(e, g) should be integrated over [0, Inf) w.r.t. e and over (-Inf, Inf) w.r.t. g.)

Parallelisation is done at a higher level (i.e., the stochastic optimiser is computing the values of this likelihood function in parallel), so it is essential that this function run as quickly as possible on a single core.

Is there any way to speed up the computation of this double integral?

  • 1
    Please use `pracma::integral2` instead of the slow `quad2d`. It is a quite modern implementation for 2-dim. integration in pure R. For your example it will return 240.16 and take about 1.5 sec. I have no idea how accurate this result is. I am afraid you will need some external procedure if this is not good enough for you. – Hans W. Mar 04 '18 at 19:53
  • @HansW. Just a short question: why are you saying that `quad2d` is slow while it is performing the best here? Also, what do you mean by “external procedure”? – Andreï V. Kostyrka Mar 04 '18 at 20:22
  • You will need to increase the number of nodes, `n = 256` or something, to make the result more accurate, and this will slow down `quad2d` considerably. With 'external' I mean you could call a Python or Julia integration routine from within R, or a specialised C library, for example. What worries me a bit is your wish to finally integrate over an infinite domain. – Hans W. Mar 04 '18 at 20:38

1 Answers1

4

Here is a wrapper for hcubature which I use to allow infinite limits:

hcubature.inf <- function() {
  cl <- match.call()
  cl[[1L]] <- quote(cubature::hcubature)
  if(all(is.finite(c(lowerLimit,upperLimit)))) return(eval.parent(cl))

  # convert limits to new coordinates to incorporate infinities                                                      
  cl[['upperLimit']] <- atan(upperLimit)
  cl[['lowerLimit']] <- atan(lowerLimit)
  # wrap the function with the coordinate transformation                                                             
  # update argument to hcubature with our function  
  f <- match.fun(f)                                                                 
  cl[['f']] <- if(!vectorInterface)
                 function(x, ...) f(tan(x), ...) / prod(cos(x))^2
               else
                 function(x, ...) f(tan(x), ...) / rep(apply(cos(x), 2, prod)^2, each=fDim)
  eval.parent(cl)
}
formals(hcubature.inf) <- formals(cubature::hcubature)

Then you should vectorize the integrands:

vwrapper1 <- function(x) as.matrix(integ1(x[1,], x[2,]))
vwrapper2 <- function(x) as.matrix(integ2(x[1,], x[2,]))

And integrate:

if (cub=="cubature.inf") {
  fw1[t] <- hcubature.inf(vwrapper1, c(0, -Inf), c(Inf, Inf), vectorInterface=TRUE)$integral
  fw2[t] <- hcubature.inf(vwrapper2, c(0, -Inf), c(Inf, Inf), vectorInterface=TRUE)$integral
} else if (cub=="cubature") {
 ...

You get a value of 242.83 in about half the time of your default method.

Simen Gaure
  • 541
  • 2
  • 5