
Overall, I am trying to "scale" an array such that the integral of the array is 1, i.e. the sum of the array elements divided by the number of elements is 1. This scaling, however, must happen by changing the parameter alpha, not by simply multiplying the array by a scaling factor. To do this, I am using scipy.optimize.minimize. The problem is that the code runs and prints "Optimization terminated successfully", but the reported function value is not 0, so the optimization was clearly not actually successful.

This is a screenshot from the paper that defines the equation.

import numpy as np
from scipy.optimize import minimize

# just defining some parameters
N = 100
g = np.ones(N)
eta = np.array([i/100 for i in range(N)])
g_at_one = 0.01   

def my_minimization_func(alpha):
    # g(eta) as given by the equation in the paper, parameterized by alpha
    g[:] = alpha*(1 + (1 - g_at_one/alpha)
                  * np.exp((eta[:] - eta[N-1])/2)
                  * (1/np.sqrt(3)*np.sin(np.sqrt(3)/2*(eta[:] - eta[N-1]))
                     - np.cos(np.sqrt(3)/2*(eta[:] - eta[N-1]))))
    # difference between the mean of g and the target value 1
    to_be_minimized = np.sum(g[:])/N - 1
    return to_be_minimized

result_of_minimization = minimize(my_minimization_func, 0.1, options={'gtol': 1e-8, 'disp': True})
alpha_at_min = result_of_minimization.x
print(alpha_at_min)
kdissel

1 Answer


Well, it is unclear to me why you are using a minimization for such a problem. You can simply normalize the matrix and then compute alpha using the normalized matrix and the old one. For matrix normalization, look here.
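A minimal sketch of the normalization step I have in mind, with placeholder data standing in for your `g` (recovering alpha from the normalized array is then a separate step):

import numpy as np

N = 100
g = np.full(N, 0.021)             # placeholder values; in your case g comes from your formula
scale = 1.0 / (np.sum(g) / N)     # factor that makes the mean of the array equal to 1
normalized_g = scale * g

print(np.sum(normalized_g) / N)   # ~ 1.0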

In your code, your objective function divides by alpha in the term `(1 - g_at_one/alpha)`, so the function is not defined at alpha = 0, which I assume is why scipy jumps over it.

EDIT: So I simply reformulated your problem using a constraint and added some prints for better visualization. I hope this helps:

import numpy as np
from scipy.optimize import minimize

# just defining some parameters
N   = 100
g   = np.ones(N)
eta = np.array([i/100 for i in range(N)])
g_at_one = 0.01   

def f(alpha):
    # g(eta) for the current alpha, same formula as in the question
    g = alpha*(1 + (1 - g_at_one/alpha)
               * np.exp((eta[:] - eta[N-1])/2)
               * (1/np.sqrt(3)*np.sin(np.sqrt(3)/2*(eta[:] - eta[N-1]))
                  - np.cos(np.sqrt(3)/2*(eta[:] - eta[N-1]))))
    to_be_minimized = np.sum(g[:])/N
    print("+ For alpha: %7s => f(alpha): %7s" % ( round(alpha[0],3),
                                                  round(to_be_minimized,3) ))
    return to_be_minimized

# constraint: require f(alpha) - 1 >= 0, i.e. the mean of g must be at least 1
cons = {'type': 'ineq', 'fun': lambda alpha:  f(alpha) - 1}
result_of_minimization = minimize(f, 
                                  x0 = 0.1,
                                  constraints = cons,
                                  tol = 1e-8,
                                  options = {'disp': True})

alpha_at_min = result_of_minimization.x

# verify that the mean of g equals 1 at the optimal alpha
print("\nAlpha at min: ", alpha_at_min[0])
alpha = alpha_at_min
g = alpha*(1 + (1 - g_at_one/alpha)
           * np.exp((eta[:] - eta[N-1])/2)
           * (1/np.sqrt(3)*np.sin(np.sqrt(3)/2*(eta[:] - eta[N-1]))
              - np.cos(np.sqrt(3)/2*(eta[:] - eta[N-1]))))
print("Verification: ", round(np.sum(g[:])/N - 1) == 0)

Output:

+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:     0.1 => f(alpha):   0.021
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
+ For alpha:   7.962 => f(alpha):     1.0
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 1.0000000000000004
            Iterations: 3
            Function evaluations: 9
            Gradient evaluations: 3

Alpha at min:  7.9620687892224264
Verification:  True
SuperKogito
  • I agree that I can normalize the matrix using, for example, `normalized_g = g[:] / (np.sum(g[:]/N))`, but I am not sure how that helps, because calculating alpha from the normalized and original matrices is still non-trivial given that alpha is inside the complicated expression – kdissel Apr 10 '19 at 20:16
  • If you are looking for normalized values in `[0,1]`, your approach returns values bigger than 1. I think you are confusing normalization with scaling of a matrix. In scaling, `alpha` is a factor and you can write `normalized_g = alpha*g`, but in normalization that is not the case. Please provide a better explanation of your goal and the type of normalization you need. – SuperKogito Apr 10 '19 at 21:17
  • A trivial approach, though I am not sure if it is correct, is to divide your matrix by its norm or its max value. In that case you will have values in `[0,1]` and `alpha=1/norm(g)` or `alpha=1/max(g)`, depending on which you use (see the sketch after these comments). – SuperKogito Apr 10 '19 at 21:20
  • Hi, I edited the problem statement to make the scaling clearer. I am not looking for traditional normalization or scaling, but rather a particular scaling using the parameter alpha such that `np.sum(g[:])/N = 1` – kdissel Apr 10 '19 at 21:55
  • Okay, that I get, but can you please explain what you are doing here: `g[:] = alpha*(1+(1-g_at_one/alpha)*np.exp((eta[:]-eta[N-1])/2)*(1/np.sqrt(3)*np.sin(np.sqrt(3)/2*(eta[:] - eta[N-1])) - np.cos(np.sqrt(3)/2*(eta[:] - eta[N-1]))))`, or maybe provide the mathematical equation/reasoning for this. – SuperKogito Apr 10 '19 at 22:14
  • It comes from equation 20 of this paper: https://www.sciencedirect.com/science/article/pii/S138824810900397X – kdissel Apr 10 '19 at 22:20
  • Well, the paper is not accessible. I hope my code helps, even though, looking at the `f(x)` values, it is somehow no longer a minimization. To get the most out of the answers here, I suggest including the math of your equation (a screenshot or LaTeX) in the question, as it represents the core of your objective function. Good luck! – SuperKogito Apr 10 '19 at 23:06
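A sketch of the divide-by-max and divide-by-norm ideas from the comment above (these scale the values into `[0, 1]`; they are not the same as making the mean equal to 1, so treat them only as an illustration, with placeholder values for `g`):

import numpy as np

g = np.array([0.5, 1.2, 3.0, 7.5])   # placeholder values

# divide by the maximum: the largest element becomes exactly 1, and alpha = 1/max(g)
alpha_max = 1.0 / np.max(g)
g_scaled_max = alpha_max * g

# divide by the Euclidean norm: the result has unit norm, and alpha = 1/norm(g)
alpha_norm = 1.0 / np.linalg.norm(g)
g_scaled_norm = alpha_norm * g

print(g_scaled_max)
print(g_scaled_norm)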