3

I am trying to fit some data to a non-linear model with two independent variables, but the vectors for the two independent variables have different lengths; that is, xdat is shorter than ydat.

This is closely related to this question: Python curve_fit with multiple independent variables, but the requirement that xdat and ydat are different sizes seems to break things.

Let's take the example solution of xnx, but change the length of one of the arrays:

import numpy as np
from scipy.optimize import curve_fit

def func(X, a, b, c):
    x,y = X
    return np.log(a) + b*np.log(x) + c*np.log(y)

# some artificially noisy data to fit
x = np.linspace(0.1,1.1,101)
y = np.linspace(1.,2., 90) #I have changed the length of one of these arrays
a, b, c = 10., 4., 6.
z = func((x,y), a, b, c) * 1 + np.random.random(101) / 100

# initial guesses for a,b,c:
p0 = 8., 2., 7.
print(curve_fit(func, (x,y), z, p0))

If you do this, you end up with the error:

ValueError: operands could not be broadcast together with shapes (101,) (90,)

Is there a way to force curve_fit to take arrays of different lengths?

Jiles
  • I don't understand why you want to do this. They must have the same dimensions, otherwise you are missing some data to evaluate your function. You have to specify how to handle these different dimensions. – CodeZero Jan 23 '18 at 15:52
  • Hmmm... Perhaps the example is not so representative. I am trying to fit some experimentally measured spectra, in this case `x` is the frequency of the spectra, `y` is the temperature at which the measurement has been made, and `z` is array of intensities. The function I am fitting takes both x and y as inputs, so `z` is just an `len(x)` by `len(y)` array, unless I have missed something? – Jiles Jan 23 '18 at 15:57
  • For each measured value of z you have corresponding values for x and y. So they should have the same dimensions. The dimension is just the number of samples. – CodeZero Jan 23 '18 at 16:13
  • Is this not what the code above generates? – Jiles Jan 23 '18 at 16:38

2 Answers

5

There are two problems. First, your function has to return a 1-D array in order to be used by curve_fit. You can use ravel() from numpy to achieve that. To get the original shape back, you can use reshape(ydim, xdim), since the default indexing of np.meshgrid(x, y) puts y along the first axis.

The second problem is the dimensions of your independent variables: curve_fit needs an (x, y) pair for every value of z, so you have to generate a complete grid, not just two vectors. You can use meshgrid() to do this.

import numpy as np
from scipy.optimize import curve_fit

def func(X, a, b, c):
    x,y = X
    result = np.log(a) + b*np.log(x) + c*np.log(y)
    return result.ravel()

xdim = 101
ydim = 90    

x = np.linspace(0.1,1.1,xdim)
y = np.linspace(1.,2., ydim)
X=np.meshgrid(x,y)
a, b, c = 10., 4., 6.
z = func(X, a, b, c) * 1 + np.random.random(xdim*ydim) / 100

p0 = 8., 2., 7.
print(curve_fit(func, X, z, p0))

This results in approximately a=10.05005705, b=4.00004791, c=6.00011176; the exact values vary from run to run because the noise is random.
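To get the fitted surface back as a 2-D array, the flattened model output can be reshaped to the grid's shape. The following is a minimal sketch, assuming the same model and grid as in the code above; the fixed seed and the names `popt`, `z_fit` and `z_grid` are illustrative choices, not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(X, a, b, c):
    x, y = X
    # curve_fit needs a 1-D return value, so flatten the grid result
    return (np.log(a) + b*np.log(x) + c*np.log(y)).ravel()

xdim, ydim = 101, 90
x = np.linspace(0.1, 1.1, xdim)
y = np.linspace(1., 2., ydim)
X = np.meshgrid(x, y)  # two arrays, each of shape (ydim, xdim)

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
z = func(X, 10., 4., 6.) + rng.random(xdim*ydim) / 100

popt, pcov = curve_fit(func, X, z, p0=(8., 2., 7.))

# Undo the ravel(): rows follow y, columns follow x
z_fit = func(X, *popt).reshape(ydim, xdim)
z_grid = z.reshape(ydim, xdim)
print(z_fit.shape)
print(np.abs(z_fit - z_grid).max())
```

Row `i`, column `j` of `z_fit` then corresponds to `y[i]` and `x[j]`, which is the layout that plotting functions such as matplotlib's `pcolormesh(x, y, z_fit)` expect.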

CodeZero
2

You might find lmfit (https://lmfit.github.io/lmfit-py/) helpful for this. It has a different take on curve fitting from curve_fit, but among many improvements, it does support multiple independent variables, and they do not need to be in the first argument position (that is the default but it can be changed) or arrays that are the same length as the data.

For general minimization problems, there is no concept of "independent variable". There are the variable parameters, and the residual calculated from those. The fact that one might use extra information like data(!), or uncertainties, or independent variables, or optional switches that might be used in the calculation of the residual is completely unimportant to the minimization routine. So, multiple "independent variables", some of which might be arrays of the same length as the data or might be booleans, or dictionaries, or other custom objects should not be a conceptual problem, and should be allowed.
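The point about residuals can be sketched with scipy.optimize.least_squares, which only sees the parameter vector and the residual it gets back. This is an illustration under assumptions, not the answerer's code: the `residual` function, the seed and the lower bound on `a` are all choices made here. The independent variables are passed as extra arguments and have different lengths; broadcasting builds the grid.

```python
import numpy as np
from scipy.optimize import least_squares

def residual(params, x, y, data):
    # The minimizer only varies `params`; x, y and data are just extra inputs
    a, b, c = params
    model = np.log(a) + b*np.log(x) + c*np.log(y)  # broadcasts to (len(y), len(x))
    return (model - data).ravel()

x = np.linspace(0.1, 1.1, 101)        # shape (101,)
y = np.linspace(1., 2., 90)[:, None]  # shape (90, 1), a column for broadcasting

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
data = np.log(10.) + 4.*np.log(x) + 6.*np.log(y) + rng.random((90, 101)) / 100

fit = least_squares(
    residual,
    x0=(8., 2., 7.),
    bounds=([1e-9, -np.inf, -np.inf], np.inf),  # keep a > 0 so log(a) is defined
    args=(x, y, data),
)
print(fit.x)
```

Here `x` has 101 points, `y` has 90, and `data` has 90 x 101; none of that matters to the minimizer as long as the residual comes back as a 1-D array.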

Lmfit does allow all of these. By default, function arguments that are positional or keyword arguments with numerical default values are assumed to be Parameters, except for those explicitly called independent variables. But you can override these defaults.

M Newville