0

I am trying to apply a 2D curve fit a data (arbitrary) set as given below:

# Data
T   Z   X 1 X 2 X 3 X 4 X 5
100.000 1.000   1.000   1.478   1.304   1.162   0.805
200.000 1.500   2.000   2.314   2.168   2.086   1.801
300.000 2.250   3.000   3.246   3.114   3.058   2.798
400.000 3.375   4.000   4.211   4.087   4.044   3.780
500.000 5.063   5.000   5.189   5.070   5.035   4.780

The ultimate aim is to develop a correlation of the form Z = f(X, T).

At first it is curve fit using a quadratic expression Z = a * x ^ 2 + b * x + c along a constant value of T i.e. along each rows, which gives as fit parameters for each T as given below (as an example):

T   a   b   c
100.00  1.00    2.10    10.02
200.00  4.00    6.20    10.06
300.00  9.00    12.30   10.12
400.00  16.00   20.40   10.20
500.00  25.00   30.50   10.31

Now I would like to fit each fit parameters along T so that I get the equations of the form a = p * T ^ 2 + q * T + r, b = s * T ^ 2 + t * T + u, etc. I tried applying this using the code:

from __future__ import division
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np

data = open('data.dat', "r")
line = data.readline()
while line.startswith('#'):
    line = data.readline()
data_header = line.split("\t")
data_header[-1] = data_header[-1].strip()


_data_ = np.genfromtxt('data.dat', skiprows=2, delimiter='\t', names = data_header, dtype = None, unpack = True).transpose()
data = np.array(_data_.tolist())
m = data.shape[0]
n = data.shape[1] - 2
print m, n
y_data = np.empty(shape=(m, n))
for i in range(0, m):
    for j in range(0, n):
        y_data[i, j] = (data[i, j+2])
x = _data_['X']
z = _data_['Z']

def quadratic_fit(x, a, b, c):
    return a * x ** 2 + b * x + c

fit_a = np.empty(shape = (m, 1))
fit_b = np.empty(shape = (m, 1))
fit_c = np.empty(shape = (m, 1))
z_fit = np.empty(shape=(m, len(z)))
for j in range(m):
    x_fit = y_data[j, :]
    y_fit = z
    fit_a[j], fit_b[j], fit_c[j] = optimize.curve_fit(quadratic_fit, x_fit, y_fit)[0]
fit_a_fit_a, fit_a_fit_b, fit_a_fit_c, = optimize.curve_fit(quadratic_fit, x, fit_a)[0]
fit_b_fit_a, fit_b_fit_b, fit_b_fit_c, = optimize.curve_fit(quadratic_fit, x, fit_b)[0]
fit_c_fit_a, fit_c_fit_b, fit_c_fit_c, = optimize.curve_fit(quadratic_fit, x, fit_c)[0]
fit_a = fit_a_fit_a * x ** 2 + fit_a_fit_b * x + fit_a_fit_c
fit_b = fit_b_fit_a * x ** 2 + fit_b_fit_b * x + fit_b_fit_c
fit_c = fit_c_fit_a * x ** 2 + fit_c_fit_b * x + fit_c_fit_c
for j in range(m):              
    z_fit[j, :] = (fit_a[j] * x_fit ** 2) + (fit_b[j] * x_fit) + fit_c[j] 

But it gives me the following error:

ValueError: object too deep for desired array
Traceback (most recent call last):
    fit_a_fit_a, fit_a_fit_b, fit_a_fit_c, = optimize.curve_fit(quadratic_fit, x, fit_a)[0]
  File "scipy/optimize/minpack.py", line 533, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "scipy/optimize/minpack.py", line 378, in leastsq
    gtol, maxfev, epsfcn, factor, diag)
minpack.error: Result from function call is not a proper array of floats.

How can this be done in Python?

Tom Kurushingal
  • 6,086
  • 20
  • 54
  • 86
  • the problem is in optimize.curve_fit and probably has something to do with recursion. without that function/class as a reference, not much help to provide. – Jonathan Mar 30 '15 at 20:47
  • Are you sure you want to be fitting the parameters? It looks like what you actually want is to do a 2D fit to f(x,T) = z. – kgully Mar 31 '15 at 14:57
  • Exactly, I need a 2D fit. – Tom Kurushingal Mar 31 '15 at 15:29
  • In that case, you can still use optimize.curve_fit but your 'x' data has to be contain both x and T at each point. You will need to edit your quadratic_fit function as well to have parameters for x, y, and maybe combinations of them. Check out the answer to [this question](http://stackoverflow.com/questions/7997152/python-3d-polynomial-surface-fit-order-dependent) – kgully Mar 31 '15 at 17:13
  • This is a good reference, https://github.com/erich666/GraphicsGems/blob/master/gems/FitCurves.c - I made a CPython module for it here https://github.com/ideasman42/curve-fit-nd (with some advantages over the original) – ideasman42 Sep 15 '17 at 14:48

2 Answers2

0

I just played around a bit, and I think your problem is the lines

fit_a = np.empty(shape = (m, 1))
fit_b = np.empty(shape = (m, 1))
fit_c = np.empty(shape = (m, 1))

should be

fit_a = np.empty(shape = (m, ))
fit_b = np.empty(shape = (m, ))
fit_c = np.empty(shape = (m, ))

It certainly seems like shape (m,1) should be right, but it is not treated the same as shape (m,) which is just a single-dimensional array. Try that and see if it works.

That said, I'm not sure fitting the fit parameters is the right way to do this problem, at least as far as I understand it...

kgully
  • 650
  • 7
  • 16
0

May I take this opportunity to shamelessly plug my own fitting package symfit?

I developed it precisely to make fitting problems such as yours easier. Without having run it for your problem, this is how I would solve it using symfit:

from symfit import parameters, variables, Fit

Z, X, T = variables('Z, X, T')
p, q, r, s, t, u  = parameters('p, q, r, s, t, u')

a = p * T**2 + q * T + r
b = s * T**2 + t * T + u
c = ...
model = {Z: a * X ** 2 + b * X + c}

fit = Fit(model, X=_data_['X'], T=_data_['T'], Z=_data_['Z'])
fit_result = fit.execute()

print(fit_result)

For more info, check the docs :).

tBuLi
  • 2,295
  • 2
  • 16
  • 16