I took Allan Cameron's solution and compared it with
a heuristic, Threshold Accepting (TA; a variant of
Simulated Annealing). Essentially, TA starts with a
random submatrix and then incrementally changes this
submatrix, e.g. by exchanging row indices, or by
adding or removing a row and a column.
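In outline, TA is a local search that accepts a new candidate not only when it is better, but also when it is worse by no more than a threshold; the thresholds decrease to zero over the search, so TA behaves more and more like a greedy search. A minimal sketch of this acceptance loop (purely illustrative; the actual implementation used below is TAopt from the NMOF package):

TA_sketch <- function(OF, x.init, nb, thresholds, ...) {
    xc <- x.init
    fc <- OF(xc, ...)
    for (tau in thresholds) {  ## decreasing, non-negative thresholds
        xn <- nb(xc)           ## propose a neighbour of the current solution
        fn <- OF(xn, ...)
        if (fn - fc <= tau) {  ## accept unless too much worse
            xc <- xn
            fc <- fn
        }
    }
    xc
}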
A solution is coded as a list of two logical vectors, marking the selected rows and columns.
So for a matrix of size 5x5, one candidate solution might be
x
## [[1]]
## [1] TRUE FALSE FALSE TRUE FALSE
##
## [[2]]
## [1] TRUE FALSE TRUE FALSE FALSE
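For concreteness, such a candidate can be built by hand, and logical indexing then extracts the corresponding submatrix (a small illustration with a random M):

x <- list(c(TRUE, FALSE, FALSE, TRUE, FALSE),
          c(TRUE, FALSE, TRUE, FALSE, FALSE))
M <- matrix(rnorm(25), 5, 5)
M[x[[1]], x[[2]], drop = FALSE]  ## the selected 2x2 submatrix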
Such a solution is changed through a neighbourhood function, nb. For instance:
nb(x)
## [[1]]
## [1] TRUE FALSE FALSE TRUE TRUE
##
## [[2]]
## [1] TRUE FALSE TRUE TRUE FALSE
(Here, a row and a column have been added: the fifth row and the fourth column are now selected as well.)
Given such a solution, we'll need an objective function.
OF <- function(x, M)
    -det(M[x[[1]], x[[2]], drop = FALSE])
Since the implementation of TA I'll use minimises, I've put a minus in front of the determinant.
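With the example solution x and matrix M from above:

OF(x, M)  ## the same as -det(M[c(1, 4), c(1, 3)])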
A neighbourhood function nb could be this (though it could certainly be improved):
nb <- function(x, ...) {
    if (sum(x[[1L]]) > 0L &&
        sum(x[[1L]]) < length(x[[1L]]) &&
        runif(1) > 0.5) {
        ## rearrange within the rows or within the columns:
        ## flip an equal number of selected and unselected
        ## entries, so the size of the submatrix is unchanged
        rc <- if (runif(1) > 0.5) 1 else 2
        select1 <- which( x[[rc]])
        select2 <- which(!x[[rc]])
        size <- min(length(select1), length(select2))
        size <- sample.int(size, 1)
        i <- select1[sample.int(length(select1), size)]
        j <- select2[sample.int(length(select2), size)]
        x[[rc]][i] <- !x[[rc]][i]
        x[[rc]][j] <- !x[[rc]][j]
    } else {
        ## grow or shrink: add or remove one row together
        ## with one column, so the submatrix stays square
        i <- sample.int(length(x[[1L]]), 1)
        if (x[[1L]][i]) {
            select <- which( x[[2L]])
        } else {
            select <- which(!x[[2L]])
        }
        j <- select[sample.int(length(select), 1)]
        x[[1L]][i] <- !x[[1L]][i]
        x[[2L]][j] <- !x[[2L]][j]
    }
    x
}
Essentially, nb flips a coin and then either rearranges row or column indices (i.e. leaves the size of the submatrix unchanged), or adds or removes a row and a column.
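In either case, the number of selected rows always equals the number of selected columns, so the submatrix stays square and its determinant is defined. A quick sanity check of this invariant (a sketch; any starting solution would do):

x <- list(c(TRUE, FALSE, FALSE, TRUE, FALSE),
          c(TRUE, FALSE, TRUE, FALSE, FALSE))
for (k in 1:1000) {
    x <- nb(x)
    stopifnot(sum(x[[1]]) == sum(x[[2]]))  ## submatrix stays square
}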
Finally, I create a helper function to create random initial solutions.
x0 <- function() {
    ## 'n' (the size of the matrix) is taken
    ## from the enclosing environment
    k <- sample(n, 1)
    x1 <- logical(n)
    x1[sample(n, k)] <- TRUE
    x2 <- sample(x1)  ## a permutation of x1: same number of TRUEs
    list(x1, x2)
}
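Note that x0 reads n from the enclosing environment; and because x2 is a permutation of x1, initial solutions are square as well:

n <- 5
x <- x0()
sum(x[[1]]) == sum(x[[2]])
## [1] TRUE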
We can now run Threshold Accepting. I use an implementation called TAopt, provided in the NMOF package (which I maintain). As is good practice with heuristics, I do 10 restarts and keep the best result. (The function max_det below is Allan Cameron's.)
n <- 5
M <- matrix(rnorm(n*n), n, n)
max_det(M)$indices
## $rows
## [1] 1 2 4
##
## $columns
## [1] 2 3 5
library("NMOF")
restartOpt(TAopt, 10, OF,
           list(x0 = x0,
                neighbour = nb,
                printBar = FALSE,
                printDetail = FALSE,
                q = 0.9,
                nI = 1000, drop0 = TRUE),
           M = M, best.only = TRUE)$xbest
## [[1]]
## [1] TRUE TRUE FALSE TRUE FALSE
##
## [[2]]
## [1] FALSE TRUE TRUE FALSE TRUE
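Mapping the logical vectors back to indices (with which) recovers the rows and columns:

lapply(list(c(TRUE, TRUE, FALSE, TRUE, FALSE),
            c(FALSE, TRUE, TRUE, FALSE, TRUE)),
       which)
## [[1]]
## [1] 1 2 4
##
## [[2]]
## [1] 2 3 5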
So we get the same rows/columns. I ran the following small experiment, for increasing sizes of M, from 2 to 20: each time, I compare the solution of TA with the optimal one, and I also record the times (in seconds) that TA and complete enumeration require.
set.seed(134345)
message(format(c("Size",
                 "Optimum",
                 "TA",
                 "Time optimum",
                 "Time TA"), width = 13, justify = "right"))
for (i in 2:20) {
    n <- i
    M <- matrix(rnorm(n*n), n, n)
    t.opt <- system.time(opt <- max_det(M)$max_determinant)
    t.ta  <- system.time(ta  <- -restartOpt(TAopt, 10, OF,
                                            list(x0 = x0,
                                                 neighbour = nb,
                                                 printBar = FALSE,
                                                 printDetail = FALSE,
                                                 q = 0.9,
                                                 nI = 1000, drop0 = TRUE),
                                            M = M, best.only = TRUE)$OFvalue)
    message(format(i, width = 13),
            format(round(opt, 2), width = 13),
            format(round(ta, 2), width = 13),
            format(round(t.opt[[3]], 1), width = 13),
            format(round(t.ta[[3]], 1), width = 13))
}
The results:
         Size      Optimum           TA Time optimum      Time TA
            2           NA         1.22            0          0.7
            3         1.46         1.46            0          0.6
            4         2.33         2.33            0          0.7
            5        11.75        11.75            0          0.7
            6         9.33         9.33            0          0.7
            7          9.7          9.7            0          0.7
            8       126.38       126.38          0.1          0.7
            9         87.5         87.5          0.3          0.7
           10       198.63       198.63          1.3          0.7
           11      1019.23      1019.23          5.1          0.7
           12     34753.64     34753.64           20          0.7
           13     16122.22     16122.22         80.2          0.7
           14     168943.9     168943.9        325.3          0.7
           15     274669.6     274669.6       1320.8          0.7
           16      5210298      5210298       5215.4          0.7
So, at least up to size 16x16, both methods return the same result (for size 2, the enumeration returned NA). But TA needs an essentially constant time of less than one second, because its number of iterations is fixed at 1000.