
I am trying to solve an equation similar to the simplified example

x / x.sum() - b = 0

where x is an n-dimensional vector. Since x can be multiplied by any constant without changing the equation, the solution is only determined up to normalization. Because of this, I try to add an (n+1)-th equation, such that

x.sum() - 1 = 0

My attempts to put this in code have all produced errors. This is the most recent minimal example:

import numpy as np
from scipy.optimize import fsolve

n = 1000
a = np.ones(n)
b = np.random.rand(n)

def f(x):
    return (x / (x.sum() + 1) - b, x.sum() - 1)

x = fsolve(f, a)

print(x)

Is this possible with fsolve? What is the correct code?

Further context: The function is a simplified example. The actual function that I attempt to solve is non-linear and complicated. I can prove that a solution for x exists and that it is unique up to scaling.

eigenvector
  • From the documentation of fsolve, the output of your function needs to have the same number of entries as your input: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fsolve.html Can you share the actual problem you are trying to solve? It would help; a workaround sketch follows these comments. – Learning is a mess Jul 12 '23 at 12:04
  • Yes, I understand. Problem: I have an n-dimensional system of non-linear equations that has a unique solution up to scaling the solution. So, I am trying to add an additional equation, normalizing the solution x so that all entries sum to 1. I am unsure if this is possible with fsolve, and if it is, how I can adapt the code. – eigenvector Jul 12 '23 at 12:08
  • 1
    _The actual function that I attempt to solve is non-linear and complicated_ - well, you need to show that. Otherwise any answer is a complete guess. – Reinderien Jul 12 '23 at 13:11
  • @Reinderien: You are probably right. This was useful (so thank you all) but the details seem important. Here is the related question with full details for anyone interested: https://stackoverflow.com/questions/76705742/finding-a-fast-optimization-algorithm-to-solve-a-non-linear-equation-with-unique. – eigenvector Jul 17 '23 at 15:30
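
For reference, fsolve cannot handle this n+1-equation formulation directly, because its output must have as many entries as its input. scipy.optimize.least_squares, however, accepts more residuals than unknowns. A minimal sketch of that reformulation on the toy problem, assuming b has been normalized to sum to 1 (otherwise the toy system has no solution):

import numpy as np
from scipy.optimize import least_squares

n = 10
rng = np.random.default_rng(0)
b = rng.random(n)
b /= b.sum()  # the toy system x / x.sum() = b only has solutions if b sums to 1

def residuals(x):
    # n residuals from the original system plus one residual for the normalization
    return np.append(x / x.sum() - b, x.sum() - 1)

res = least_squares(residuals, x0=np.full(n, 1 / n))
print(np.max(np.abs(res.x - b)))  # should be close to 0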

2 Answers


I suggest you rephrase your problem as an optimization problem, in which the constraint is naturally accounted for, here using scipy.optimize.minimize: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html

import numpy as np
from scipy.optimize import minimize
np.random.seed(10)

n = 1000
a = np.ones(n)
b = np.random.rand(n)

a /= a.sum() #added to speed up the optimization
b /= b.sum() #added to sanity check the solution - not needed

def opt(x):
    return ((x/x.sum()-b)**2).sum() #replaced mean by sum for better scaling with n

def norm_constraint(x):
    return x.sum() - 1


x = minimize(opt, x0=a, constraints={'type': 'eq', 'fun': norm_constraint}, tol=1e-10).x # passing tol for optimization to not terminate too early

print( np.sum(x), np.max(np.abs(x-b)))
# 1.0 1.5460126668448426e-12

Edit: here is how you can force the entries of x to be positive:

import numpy as np
from scipy.optimize import minimize
from functools import partial

np.random.seed(10)

n = 1000
a = np.ones(n)
b = np.random.rand(n)

a /= a.sum() #added to speed up the optimization

def opt(x):
    return ((x/x.sum()-b)**2).sum() #replaced mean by sum for better scaling with n

def norm_constraint(x):
    return x.sum() - 1

constraints = [{'type': 'eq', 'fun': norm_constraint}]

def pos_constraint_maker(x, idx):
    return x[idx]

for idx in range(n):
    constraints.append({'type': 'ineq', 'fun': partial(pos_constraint_maker, idx=idx)})

x = minimize(opt, x0=a, constraints=constraints, tol=1e-10).x # passing tol for optimization to not terminate too early

print( np.sum(x), np.max(np.abs(x-b)))
# 1.0 0.9536785254041191
print(np.max(np.abs(x[x < 0])), np.sum(np.round(x, 10) < 0))
# 8.525672691589617e-14 0
# x does have some negative entries, but their magnitude is very low (~1e-13), and rounding can alleviate that
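
As a side note, non-negativity can usually be enforced more compactly through the bounds argument of minimize instead of n individual inequality constraints. A minimal sketch, reusing opt, norm_constraint, a and b from the snippet above:

x = minimize(opt, x0=a, bounds=[(0, None)] * n,
             constraints={'type': 'eq', 'fun': norm_constraint}, tol=1e-10).x
print(np.sum(x), np.min(x))  # check: x sums to 1 and the smallest entry is not negative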
Learning is a mess
  • `eq` is not as simple a constraint as it could be. `LinearConstraint` is simpler. – Reinderien Jul 12 '23 at 13:32
  • Thanks for the heads up, I was not familiar with LinearConstraints. Note that my method works for the given example. I am willing to test it against the non-linear instance of interest when possible – Learning is a mess Jul 12 '23 at 13:51
  • When I run your code, I get x = 0.001 for all entries and when I calculate b_new = x / x.sum(), the result is very different from the old b in the equation. Do you know why that is? p.s. For replicability, I use np.random.seed(10) but it does not seem to matter. – eigenvector Jul 14 '23 at 09:00
  • Moreover, for small n (in my case, n < 109) the solution for x is correct. For n >= 109, it resorts to the generic x = 0.001. What am I missing here? – eigenvector Jul 14 '23 at 09:09
  • Good points! It seems to happen because I had replaced `.sum()` by `.mean()` in the objective function. I just rolled it back; please check on your side if this works. The reason you were getting a uniform vector, the same as the initialization, is that `mean` with a large n value caused the change in the objective function to be smaller than the termination epsilon, meaning very early termination was triggered. Another way is passing a smaller `tol` value to minimize. – Learning is a mess Jul 14 '23 at 09:19
  • Okay, I think I can reproduce that. One other question: when I take out the line b /= b.sum(), the solved x is negative and the b_new calculated as x / x.sum() is different from the original b. Do you know why that is? Strangely, the negative x persists even when I add another constraint forcing x to be positive. – eigenvector Jul 14 '23 at 09:49
  • Why do you expect `x/x.sum()` to equal b in that setting? The optimizer is being asked to satisfy two objectives: `x/x.sum()` should be as close to `b` as possible, and `x` should sum to 1. But in the general case `b` does not sum to 1, so you cannot satisfy these two conditions simultaneously; hence I modified `b` to sum to 1, making both objectives simultaneously satisfiable and the solution easier to check. About the positivity of the entries, let me update my answer to show how it can be enforced. – Learning is a mess Jul 14 '23 at 10:20

The first method (and probably the one most likely to perform well) is to reduce the number of degrees of freedom of your decision variables by one, and calculate the last entry to guarantee a sum of 1. Note that this will not converge for your bogus b because you haven't also normalized it to 1.

import numpy as np
from numpy.random import default_rng
from scipy.optimize import minimize

n = 10  # 1000
rand = default_rng(seed=0)
b = rand.random(n)
b /= sum(b)

def f(xpartial: np.ndarray) -> float:
    # rebuild the full vector so that it sums to 1 by construction
    x = np.concatenate((xpartial, (1 - xpartial.sum(),)))
    error = x - b
    return error.dot(error)

result = minimize(
    fun=f,
    x0=np.full(n-1, fill_value=1/n),
)
assert result.success, result.message
print(result)
print('Compare:')
print(result.x)
print(b)

The second method is to set a linear constraint (do not use a generic nonlinear constraint if your constraint function is linear). This is a modification of @learning-is-a-mess's solution that, in theory, should provide a marginal improvement in numerical performance.

import numpy as np
from numpy.random import default_rng
from scipy.optimize import minimize, LinearConstraint

n = 10  # 1000
rand = default_rng(seed=0)
b = rand.random(n)
b /= sum(b)

def f(x: np.ndarray) -> float:
    error = x - b
    return error.dot(error)

result = minimize(
    fun=f,
    x0=np.full(n, fill_value=1/n),
    constraints=LinearConstraint(A=np.ones_like(b), lb=1, ub=1),
)
assert result.success, result.message
print(result)
print('Compare:')
print(result.x)
print(b)
Reinderien
  • Good job on using my method without giving me any quote even though I posted it way before you. About `do not use a generic nonlinear constraint!`: care to elaborate on that? Non-linear constraints are nothing to be terrified of, depending on the simplicity/convexity of your objective function. Proof is the example I give here-above or here-below. Also, why are you solving it for `n=10` if your method is claimed to be more efficient? – Learning is a mess Jul 12 '23 at 14:04
  • That's a lot to unpack. There's no formal quote system in SO; issuing an informal quote is I guess possible but it's a common optimization pattern and what you've come up with isn't particularly novel; and given the two implementations I'd recommend my own. If you give the optimizer a generic function then it needs to infer a Jacobian estimate. With a linear constraint the Jacobian is numerically trivial. – Reinderien Jul 12 '23 at 14:10
  • Point by point: I never claimed my approach is novel or worth its own publication, but in this space of discussion I see it as good manners to propose your version as an improvement on mine, instead of as if it had not been suggested already. That said, I agree with the marginal improvement of having an exact calculation of the Jacobian instead of a finite-difference approximation of it. I am trying to say that it is not required, though, and that non-linear constraints can be handled gracefully; saying the opposite could be a limiting thought in the future. – Learning is a mess Jul 12 '23 at 14:20
  • @Learningisamess re. good manners, I tend to agree, and I've edited my answer. I'll only add, about _a limiting thought_ - numerical optimisation is a fickle creature, and basically always requires _some_ measure of design limitation; the more stress you can alleviate from the optimizer the better. For this same reason, though the OP has left their actual work a mystery, they should be adding bounds and an analytic Jacobian for the function itself (a sketch of this appears after these comments). – Reinderien Jul 12 '23 at 18:05
  • PS. In future, if you lead with _could you please quote me_ (good manners) instead of sarcasm you'll find that you get better results. – Reinderien Jul 12 '23 at 18:12
  • Thanks for the edit, and apologies if it came off as too strong sarcasm; maybe I got heated too quickly or replied too early. FYI I went back to my notebook to A/B test the linear vs the non-linear constraint; speed and final accuracy seem to match (amusingly to me). Agreeing on your last point, a bespoke solution should be fed all available derivatives and helpful bounds + initialization. I would also advise using pyomo over scipy minimize. – Learning is a mess Jul 12 '23 at 21:20
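
Following up on the last comment above about bounds and an analytic Jacobian: here is a minimal sketch on the toy objective from this answer (not the OP's real function) that passes a hand-written gradient and non-negativity bounds to minimize; for f(x) = ||x - b||^2 the gradient is simply 2*(x - b):

import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

n = 10
rng = np.random.default_rng(0)
b = rng.random(n)
b /= b.sum()

def f(x: np.ndarray) -> float:
    error = x - b
    return error.dot(error)

def grad_f(x: np.ndarray) -> np.ndarray:
    # analytic gradient of ||x - b||^2
    return 2 * (x - b)

result = minimize(
    fun=f,
    x0=np.full(n, fill_value=1/n),
    jac=grad_f,
    bounds=Bounds(0, np.inf),  # keep every entry non-negative
    constraints=LinearConstraint(A=np.ones_like(b), lb=1, ub=1),  # entries sum to 1
)
assert result.success, result.message
print(result.x.sum(), np.max(np.abs(result.x - b)))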