
I have a problem and I'm a bit lost at this point, so any input would be greatly appreciated. I'm really struggling right now!

I have a model I want to check/optimise using some experimental data I got.

Generally speaking, my model takes two inputs (let's say time and temperature) and has 8 variables (x0-x7). The model generates two outputs (out1 and out2).

Each set of my experimental data gives me four pieces of information I can use for my optimisation: the 2 inputs (time and temperature) and the 2 experimental results (result1 and result2).

Ultimately I want to minimize the difference between result1 & out1 and result2 & out2. So basically I'm minimizing two residuals over several sets of data, all of which are affected by the same 8 shared parameters (x0-x7).

I have some bounds for the parameters x0-x7 which can help but besides that no real constraints.
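To make the structure concrete, this is roughly the shape of my model's interface (a minimal sketch with a made-up toy formula; the real model is far more involved, and all names here are placeholders):

import numpy as np

def model(time, temperature, x):
    # Toy stand-in for my real model: some function of the two inputs
    # and the eight shared parameters x[0]..x[7], returning two outputs.
    out1 = x[0] * np.exp(-time / x[1]) + x[2] * np.sqrt(temperature / x[3])
    out2 = x[4] * time + x[5] * temperature / (x[6] + x[7] * time)
    return out1, out2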

So far I have tried using scipy.minimize with an iteration through my experimental result datasets like so (very schematic):

import numpy as np
from scipy.optimize import minimize

# Experimental data: set labels, result1 values, result2 values
# (out1_1 etc. are placeholders for the measured numbers)
Experiment = [['Set 1', 'Set 2', 'Set 3', 'Set 4'],
              [out1_1, out1_2, out1_3, out1_4],
              [out2_1, out2_2, out2_3, out2_4]]

curr_case = 0  # index of the current experimental set, just for debugging in the first place

def objective_fcn(x):

    SetFitParameters(x)  # x0-x7

    # ---------probably totally dumb: iteration-----------
    global curr_case  # number of the experimental set
    curr_case = curr_case + 1
    if curr_case == len(Experiment[0]):  # wrap around after the last set
        curr_case = 0
    # ----------------------------------------------------

    getTemp(curr_case)  # gets time and temperature from the experimental data as two arrays

    RefVariables(x)  # sets some global variables needed by ModelCal using x0-x7

    ModelCal(time, Temperature)  # gives Out1 and Out2

    f1 = abs(Out1[Upper_index - 1] - Experiment[1][curr_case])  # compares Out1 with result1 (from experimental data)
    f2 = abs(Out2[Upper_index - 1] - Experiment[2][curr_case])  # compares Out2 with result2 (from experimental data)

    # some weighting factors for the future - maybe?
    A = 1
    B = 1

    return A * f1 + B * f2

bounds_x0 = (1450, 1700)  # lower and upper bounds of x0
bounds_x1 = (0.1, 1)
bounds_x2 = (1450, 1700)
bounds_x3 = (0.1, 7)
bounds_x4 = (1450, 1700)
bounds_x5 = (0.1, 7)
bounds_x6 = (1450, 1700)
bounds_x7 = (0.1, 7)

bounds = [bounds_x0, bounds_x1, bounds_x2, bounds_x3,
          bounds_x4, bounds_x5, bounds_x6, bounds_x7]

x0 = [1663, 0.156, 1523, 6.37, 1663, 4.38, 1523, 2.2]  # some initial guesses

result = minimize(objective_fcn, x0, bounds=bounds)

This pretty obviously didn't work, because I just cycled through the different cases. A search on Stack Overflow yielded some results; however, they all seem to optimise a given analytical function, which I don't have!

The first question would be: What kind of optimisation would you recommend? Is this even close to something useful?

Second question: How do I get more than one experimental data set to be considered in my optimisation? My method of getting the inputs seems rather crude. I also tried creating two lists with the data already stored as array elements (see the schematic sketch below), but to no avail.
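Schematically, that list-based attempt looked something like this (all values are made-up placeholders):

import numpy as np

# one entry per experimental set; each entry holds the full input array
time_data = [np.array([0.0, 1.0, 2.0]), np.array([0.0, 0.5, 1.0])]
temp_data = [np.array([1500.0, 1550.0, 1600.0]), np.array([1480.0, 1520.0, 1580.0])]

result1_data = [0.83, 0.79]  # measured result1 for each set
result2_data = [1.42, 1.38]  # measured result2 for each set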

Lastly: As anybody who has a little bit of knowledge in optimisation can already see, I'm pretty green in this field - so I'm sorry in advance, but if anybody can point me in the right direction or can help - it would be GREATLY appreciated!

Sources I found already:

- Fitting multiple data sets using scipy.optimize with the same parameters
- Fit plane to a set of points in 3D: scipy.optimize.minimize vs scipy.linalg.lstsq


1 Answer


The basic idea of a shared objective function is fine. I won't really go into the details of the OP's attempt, as that might be misleading. The approach is to define a proper residuals function that can be used in a least-squares fit. There are several possibilities in Python to do that. I'll show scipy.optimize.leastsq and the closely related scipy.optimize.least_squares.

import numpy as np
from scipy.optimize import least_squares  ## allows bounds and built-in loss functions, but only provides the Jacobian
from scipy.optimize import leastsq  ## provides the scaled covariance matrix


def test_function( time, temp, x0, x1, x2 ):
    """
    Some arbitrary test function taking two inputs and providing
    two correlated outputs with shared parameters - only three for testing.
    """
    s = np.sqrt( time / x0 ) * np.log( ( temp - x1 ) / x2 )
    t = np.exp( -time / x0 ) * np.sqrt( ( time / x0 )**2 + ( ( temp - x1 ) / x2 )**2 )
    return s, t

### make some data with noise
indata = list()
for _ in range( 60 ):
    a = 50 * np.random.random()
    b = 10 + 25 * np.random.random()
    indata.append( [ a, b ] )

outdata = list()
for a, b in indata:
    s, t = test_function( a, b, 3.78, 5.33, 12.88 )
    noise1 = np.random.normal( scale=0.01 )
    noise2 = np.random.normal( scale=0.01 )
    outdata.append( [ s + noise1, t + noise2 ] )

indata = np.array( indata )
outdata = np.array( outdata )

##########################################################################
### define the residuals function for fitting. This is the important part!
##########################################################################

def residuals( params, xdata, ydata, weightA=1, weightB=1 ):
    x0, x1, x2 = params
    diff = list()
    for ab, st in zip( xdata, ydata ):  # use the passed-in data, not the globals
        a, b = ab
        s, t = st
        sf, tf = test_function( a, b, x0, x1, x2 )
        diff.append( weightA * ( s - sf ) )
        diff.append( weightB * ( t - tf ) )
    return diff

### Fit
solx, cov, info, msg, ier = leastsq(
    residuals, [ 3.8, 5.0, 12.5 ],
    args=( indata, outdata ), full_output=True
)
print( solx )
print( cov )

sol = least_squares( residuals, [ 3.8, 5.0, 12.5 ], args=( indata, outdata ) )
print( sol.x )

It should be easy to adapt this to the OP's needs; a rough sketch follows.
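For instance, a schematic adaptation to the OP's setup might look like the following. Here model_cal is a placeholder that wraps whatever the OP's ModelCal does, taking the inputs and the eight parameters directly and returning both outputs; times, temps, results1 and results2 are assumed to hold the experimental data, one entry per set. None of these names come from the OP's code.

import numpy as np
from scipy.optimize import least_squares

def residuals( params, times, temps, results1, results2, weightA=1, weightB=1 ):
    diff = list()
    for time, temp, r1, r2 in zip( times, temps, results1, results2 ):
        out1, out2 = model_cal( time, temp, *params )  # eight shared parameters x0-x7
        diff.append( weightA * ( r1 - out1 ) )
        diff.append( weightB * ( r2 - out2 ) )
    return diff

lower = [ 1450, 0.1, 1450, 0.1, 1450, 0.1, 1450, 0.1 ]
upper = [ 1700, 1.0, 1700, 7.0, 1700, 7.0, 1700, 7.0 ]
xstart = [ 1663, 0.156, 1523, 6.37, 1663, 4.38, 1523, 2.2 ]

sol = least_squares(
    residuals, xstart,
    bounds=( lower, upper ),  # least_squares takes ( all lower, all upper )
    args=( times, temps, results1, results2 )
)
print( sol.x )

Note that least_squares takes the bounds as one pair of sequences (all lower bounds, then all upper bounds) rather than the per-parameter (low, high) pairs that minimize expects, and that leastsq does not support bounds at all.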
