
I am trying to use the "brute" method to minimize a function of 20 variables. It is failing with a mysterious error. Here is the complete code:

import random
import numpy as np
import lmfit

def progress_update(params, iter, resid, *args, **kws):
    pass
    #print(resid)

def score(params, data = None):
    parvals = params.valuesdict()
    M = data
    X_params = []
    Y_params = []
    for i in range(M.shape[0]):
        X_params.append(parvals['x'+str(i)])
    for j in range(M.shape[1]):
        Y_params.append(parvals['y'+str(j)])
    return diff(M, X_params, Y_params)


def diff(M, X_params, Y_params):
    total = 0
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            total += abs(M[i,j] - (X_params[i] - Y_params[j])**2)
    return total

dim = 10
random.seed(0)
M = np.empty((dim, dim))

for i in range(M.shape[0]):
    for j in range(M.shape[1]):
        M[i,j] = i*random.random()+j**2

params = lmfit.Parameters()
for i in range(M.shape[0]):
    params.add('x'+str(i), value=random.random()*10, min=0, max=10)
for j in range(M.shape[1]):
    params.add('y'+str(j), value=random.random()*10, min=0, max=10)

result = lmfit.minimize(score, params, method='brute', kws={'data': M},  iter_cb=progress_update)

However, this fails with:

ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than the maximum possible size.

What is causing this problem?

Simd

1 Answer


"What is causing this problem"

Math

You can't brute-force a high-dimensional problem: the number of grid points grows exponentially with the number of parameters, which costs exponential time, and exponential memory if the grid is stored densely.

More directly, lmfit uses numpy (*) under the hood, and numpy has a hard limit on how large a single array it will allocate. Your input data isn't too big (10x10); it's the combinatorial grid that brute force builds over all 20 parameters that blows past that limit.
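To put a rough number on it, here is a back-of-the-envelope sketch. The 10-points-per-parameter figure is an assumption for illustration; the real count depends on each parameter's min/max and brute_step, so treat the result as order-of-magnitude only:

# Rough size of the dense grid a brute-force search would have to build.
# 10 points per parameter is an assumed value; lmfit/scipy derive the
# actual count from min, max and brute_step.
n_params = 20
points_per_param = 10

grid_points = points_per_param ** n_params           # 10**20 combinations
bytes_per_value = 8                                   # one float64 per grid entry
total_bytes = grid_points * bytes_per_value

print(f"{grid_points:.1e} grid points")               # 1.0e+20
print(f"~{total_bytes / 1e18:.0f} exabytes needed")   # ~800 exabytes

No machine can allocate that densely, which is why the allocation fails before a single function evaluation happens.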

If you're willing to hack the implementation, you could switch to a sparse or lazily generated grid so that nothing huge is ever allocated. But this doesn't solve the math problem.
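For illustration only, a sketch of what such a lazy grid walk could look like (lazy_brute is a hypothetical helper, not part of lmfit or scipy). It never materializes the dense array, so the ValueError goes away, but it still has to perform every one of the ~10**20 evaluations:

import itertools
import numpy as np

def lazy_brute(func, ranges, n_steps=10):
    # Walk the grid one point at a time instead of allocating it densely.
    # Avoids the allocation error, but the number of iterations is still
    # n_steps ** len(ranges), which is hopeless for 20 parameters.
    axes = [np.linspace(lo, hi, n_steps) for lo, hi in ranges]
    best_point, best_value = None, np.inf
    for point in itertools.product(*axes):   # generator, no giant array
        value = func(point)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value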

On High Dimensional Optimization

Try a different minimizer, but be warned: it's very difficult to minimize globally in high-dimensional spaces. Local methods such as fixed-point iteration or gradient descent might be more productive.

I hate to be pessimistic, but high-dimensional optimization is very hard in general, and I'm afraid it is beyond the scope of an SO question. Here is a survey.

Practical Alternatives

Gradient descent is supported a little in sklearn, but it's geared toward machine learning rather than general optimization; scipy has pretty good optimization coverage and great documentation. I'd start there. Gradient descent is possible there too, but not required.

From scipy's docs on unconstrained minimization, you have many options:

Method Nelder-Mead uses the Simplex algorithm. This algorithm is robust in many applications. However, if numerical computation of derivative can be trusted, other algorithms using the first and/or second derivatives information might be preferred for their better performance in general.

Method Powell is a modification of Powell’s method, which is a conjugate direction method. It performs sequential one-dimensional minimizations along each vector of the directions set (direc field in options and info), which is updated at each iteration of the main minimization loop. The function need not be differentiable, and no derivatives are taken.

and many more derivative-based methods are available. (In general, you do better when you have derivative information available.)
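For concreteness, here is a minimal sketch of the same objective fed to scipy.optimize.minimize instead of a grid search. The vectorized diff_flat helper and the Nelder-Mead settings are my own choices, not anything lmfit requires; a bounded method such as L-BFGS-B would also accept the 0-10 bounds from the question and use numerical gradients:

import numpy as np
from scipy.optimize import minimize

def diff_flat(v, M):
    # Same residual as in the question, with the 10 x's and 10 y's
    # packed into a single 20-element vector.
    n = M.shape[0]
    x, y = v[:n], v[n:]
    return np.abs(M - (x[:, None] - y[None, :]) ** 2).sum()

rng = np.random.default_rng(0)
M = np.array([[i * rng.random() + j ** 2 for j in range(10)]
              for i in range(10)])

v0 = rng.random(20) * 10        # random start in [0, 10), as in the question
res = minimize(diff_flat, v0, args=(M,), method='Nelder-Mead',
               options={'maxiter': 20000, 'xatol': 1e-6, 'fatol': 1e-6})
print(res.fun, res.nfev)

Like any local method it only finds a local minimum from the given start, but it actually runs, which the 20-dimensional grid never will.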


Footnotes/Looking at the Source Code

(*) The actual error is raised here, in numpy's C source; the exact limit depends on your numpy build. Quoted:

if (npy_mul_with_overflow_intp(&nbytes, nbytes, dim)) {
    PyErr_SetString(PyExc_ValueError,
        "array is too big; `arr.size * arr.dtype.itemsize` "
        "is larger than the maximum possible size.");
    Py_DECREF(descr);
    return NULL;
}
en_Knight
  • Thank you. So what is the maximum size it can cope with? I also posted a related question about the failings of global optimizers too. https://stackoverflow.com/questions/51621903/why-does-global-optimization-work-so-poorly – Simd Jul 31 '18 at 21:01
  • Hi Anush, edited a little and I'll add a bit more. That's almost certainly implementation-dependent, and would go way up if you used a sparse/iterator structure under the hood. Bottom line: grid search isn't a good idea here. – en_Knight Jul 31 '18 at 21:09
  • @en_Knight is not correct in saying that lmfit "use pandas under the hood". It does not. Like Python, it should be limited only by physical memory. But en_Knight is correct that this error (from numpy) arises because you are asking for a 20-dimensional grid, each with 10 steps. That's 10**20 function evaluations. If each function evaluation takes 1 msec, the total runtime will be 3 billion years. Brute force methods work well with 2 or 3 or maybe 4 variable parameters. – M Newville Aug 03 '18 at 02:59
  • @en_Knight As a convenience, it *can* work with data in pandas.Series, which is why it imports pandas. But these are converted to numpy arrays. Under the hood, it is all numpy and scipy, not pandas. Pandas is not used in any part of the actual calculation and is not relevant for the error messages here. For sure, the user's problem is because brute force methods don't scale to high dimension (# of variables) problems, but that is not the fault of pandas or numpy. Then again, the problem is parallelizable, so if you had a billion compute nodes, it would only take 3 years to solve ;). – M Newville Aug 03 '18 at 19:23
  • @MNewville Thanks again for input. I switched pandas to numpy in the answer, keeping the pointer to the actual exception. I agree that the root problem is the algorithm - I'll try to refactor the answer to make that clearer. There are two components to the lack of scalability in brute force here: 1) the time complexity, which isn't the actual problem the user is running into (it isn't running at all, let alone for years); 2) the memory complexity, which *could* be solved by forcing numpy to use a sparse matrix instead of a dense one or using an iterator...
  • ... (cont'd) that wouldn't change the underlying problem that brute force is intractable, but I think it's important to mention because it's the actual error that OP is seeing – en_Knight Aug 03 '18 at 20:33