0

I'm trying to solve the system of equations:

enter image description here

where a,b and c are columns from pandas dataframe. I used to work with Excel, where I run a macro with go seak in one column (residuals) by changing the values of the other columns (functions), but I don't know how to do that in Python, and I have asked here, but no answer yet.

I have tried fsolve, but the result is always my initial guess, so, I guess it needs some bounds, but fsolve doesn't allow the introduction of bounds, that's why I'm trying scipyn.NonlinearConstraint Below are the dataset and functions to solve:

import numpy as np
import pandas as pd
import scipy.optimize as opt
from scipy.optimize import NonlinearConstraint
a = np.linspace(300,400,30) 
b = np.random.randint(700,18000,30) 
c = np.random.uniform(1.4,4.0,30) 
df = pd.DataFrame({'A':a, 'B':b, 'C':c})


def func(zGuess,*Params):
    x,y,z = zGuess
    a,b,c = Params
    
    eq_1 = (((3.47-np.log10(y))**2+(np.log10(c)+1.22)**2)**0.5) - x
    eq_2 = ((a/101.32) * (101.32/b)** z) - y
    eq_3 = (0.381 * x + 0.05 * (b/101.32) -0.15) - z
    return eq_1,eq_2,eq_3


zGuess = np.array([2.6,20.2,0.92])
up_bound = (9,np.inf,1.1)
low_bound = (0,-np.inf,0)

and here I try to solve with fsolve:

df['x'],df['y'],df['z'] = zip(*df.apply(lambda x: opt.fsolve(func,zGuess,args=(x['A'],x['B'],x['C'])),1) )

and here with Nonlinearconstraint:

df['x'],df['y'],df['z'] = zip(*df.apply(lambda x: NonlinearConstraint(func(zGuess,x['A'],x['B'],x['C']),low_bound,up_bound),1) )

but with Nonlinearconstraint it gives an error:

TypeError: zip argument #1 must support iteration

I have even tried to minimize over the pandas dataframe:

def func2(zGuess,*Params):
    x,y = zGuess
    a,b,c,n = Params
    
    eq_1 = (((3.47-np.log10(y))**2+(np.log10(c)+1.22)**2)**0.5) - x
    eq_2 = ((a/101.32) * (101.32/b)** n) - y
    
    return eq_1,eq_2,eq_3



zGuess2 = np.array([2.6,20.2])
n = np.linspace(0,1.,500)
for i in n:
    df['x'],df['y'] = zip(*df.apply(lambda x: opt.fsolve(func2,zGuess2,args=(x['a'],x['b'],x['c'],i)),1))
    df['n_calc'] = df.apply(lambda x: (0.381 * x['x'] + 0.05 * (x['b']/101.32) - 0.15),1)
    res = np.abs(df['n_calc'] - i)
    if res <= 1e-5:
        n_solver = i
        df['n_solver'].append(n_solver)
    

I never thought something so simple like go seak in excel would give such difficulty to perform in python, I really need some help, please.

JCV
  • 447
  • 1
  • 5
  • 15

1 Answers1

1

I'll ignore the pandas part of your question and will only address the optimization part. Ignoring all the pandas noise, your question is basically how to solve an equation with simple bounds on the variables. This has been answered several times, see here for example.

Here are some hints:

  • Your bounds don't make sense, e.g. the logarithm is only defined for positive values, not for zero.

  • Please make yourself familiar with np.ndarrays. It will make your life much easier when working with scipy.optimize.

  • Your NonlinearConstraint doesn't make sense as well. This is an object that can be passed to scipy.optimize.minimize. It is not callable or an optimization method.

  • If you want to solve an equation with some bounds on the variables, you need to reformulate the problem as nonlinear optimization problem and solve it by means of scipy.optimize.minimize, see the above linked answer for more details. Note that I changed your func such that it returns a np.ndarray.

Following is a working version that solves your equation for each row in your dataframe. Feel free to rewrite it such that you can use it inside df.apply().

from scipy.optimize import minimize

def func(zz, *params):
    x,y,z = zz
    a,b,c = params
    
    eq_1 = (((3.47-np.log10(y))**2+(np.log10(c)+1.22)**2)**0.5) - x
    eq_2 = ((a/101.32) * (101.32/b)** z) - y
    eq_3 = (0.381 * x + 0.05 * (b/101.32) -0.15) - z
    return np.array((eq_1,eq_2,eq_3))

bounds = [(1e-6, None), (1e-6, None), (1e-6, None)]
zGuess = np.array([2.6,20.2,0.92])

# create the new columns
df[["x", "y", "z"]] = np.zeros((30, 3))

for i, row in df.iterrows():
    params = row.values[:3]
    res = minimize(lambda x: np.sum(func(x, *params)**2), zGuess, bounds=bounds)
    # res.x contains your solution
    # let's write the values into the dataframe
    for k, val in zip(['x', 'y', 'z'], res.x):
        df.loc[i, k] = val

I never thought something so simple like go seak in excel would give such difficulty to perform in python

Mixing several different concepts without reading the docs makes each task difficult in any arbitrary programming language.

joni
  • 6,840
  • 2
  • 13
  • 20
  • Thank you for answering, but this solution doesn't solve my problem. The results still (very) different from excel, so I still with the same problem from when I was using fsolve. Here is the result I get from excel (and have physical meaning and make sense) https://github.com/jc-barreto/Some_tests/blob/main/test_df.xlsm – JCV Apr 01 '22 at 14:36
  • 1
    @JenifferBarreto I edited my answer. There was indeed a slight pandas mistake. Feel free to try it again with your original data. That being said, I checked your excel results and they **don't** solve your above equation. This can easily be verifed by evaluating your function with the excel values and the corresponding params. Please be aware that the above approach is clearly superior to your excel heuristic from a mathematical point of view. So I highly recommend against using the excel values as reference point. – joni Apr 01 '22 at 15:17
  • 1
    Thank you!! now the results make sense and are quite similar to excel (when using solver, because when I use go seak the results are different). – JCV Apr 04 '22 at 09:56