1

I am trying to implement a exponential regression function. sp stands for sympy. I use numpy and sympy. Firstly, in func_exp I tried to use np.exp but it generated an error (attribute error), so I decided to use sympy instead. Well, this is the code

import numpy as np
from numpy.linalg import matrix_rank
import scipy 
import scipy.integrate

import random 

import matplotlib.pyplot as plt
import matplotlib as mpl
from mpl_toolkits.mplot3d import Axes3D

from sympy import integrate
import sympy as sp

x, y = sp.symbols('x, y')

sp.init_printing(use_unicode=True,use_latex='mathjax')
def exponential_regression (x_data, y_data):
   def func_exp(x, a, b):
       return a*sp.exp(b*x)
   popt, pcov = scipy.optimize.curve_fit(func_exp, x_data, y_data)
   a = popt[0] # componente a, Parámetro ÓPTimo (popt).
   b = popt[1] # componente b, Parámetro ÓPTimo (popt).
   plt.figure()
   puntos = plt.plot(x_data, y_data, 'x', color='xkcd:maroon')
   curva_regresion = plt.plot(x_data, func_exp(x_data, a, b),    color='xkcd:teal')
   plt.show(puntos, curva_regresion)
   return func_exp(x, a, b)

I try to execute:

  x_data = np.arange(0, 51) # Crea un array de 0 a 50.
  y_data = np.array([0.001, 0.199, 0.394, 0.556, 0.797, 0.891, 1.171, 1.128, 1.437, 
          1.525, 1.720, 1.703, 1.895, 2.003, 2.108, 2.408, 2.424,2.537, 
          2.647, 2.740, 2.957, 2.58, 3.156, 3.051, 3.043, 3.353, 3.400, 
          3.606, 3.659, 3.671, 3.750, 3.827, 3.902, 3.976, 4.048, 4.018, 
          4.286, 4.353, 4.418, 4.382, 4.444, 4.485, 4.465, 4.600, 4.681, 
          4.737, 4.792, 4.845, 4.909, 4.919, 5.100])
  exponential_regression(x_data, y_data)

And I get:

exponential_regression(x_data, y_data)
TypeError: Cannot cast array data from dtype('O') to dtype('float64') according to the rule 'safe'

Traceback (most recent call last):

File "<ipython-input-122-ee7c243ae4b0>", line 1, in <module>
exponential_regression(x_data, y_data)

 File "/Volumes/TOSHIBA/spline.py", line 35, in exponential_regression
popt, pcov = scipy.optimize.curve_fit(func_exp, x_data, y_data)

 File "/Applications/anaconda3/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 742, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)

 File "/Applications/anaconda3/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 387, in leastsq
gtol, maxfev, epsfcn, factor, diag)

 error: Result from function call is not a proper array of floats.

What is wrong? Thanks in advance!

alienflow
  • 400
  • 7
  • 19
  • 3
    Why are you bringing in sympy? – user2357112 Jun 05 '18 at 17:51
  • @user2357112 because when I used sympy I got np.array([np.exp(x)*x]) Traceback (most recent call last): File "", line 1, in np.array([np.exp(x)*x]) AttributeError: 'Symbol' object has no attribute 'exp' – alienflow Jun 05 '18 at 17:56
  • 1
    That problem wouldn't have occurred if you hadn't brought in sympy. – user2357112 Jun 05 '18 at 18:06
  • @user2357112 anyway, i use sympy later in my program... – alienflow Jun 05 '18 at 18:17
  • I don't think the equation you are using has the same shape as your data - make a scatterplot of the data only and you can see this. The Stirling equation is a much better fit to your data, try the equation "y = a * (exp(bx) - 1.0) / b" with fitted parameters a = 1.8641552758230268E-01 and b = -2.8402723064116495E-02 – James Phillips Jun 05 '18 at 18:48
  • @JamesPhillips well that's what the assignment tells us to do... use a*exp(b*x). I cannot change it... – alienflow Jun 05 '18 at 19:05
  • 1
    Are you sure it doesn't ask for `a * np.exp(b * x) + c`? For this you get a good fit, but how can you get a good fit for your function? It will never go through (0, 0). – Mr. T Jun 05 '18 at 19:57
  • @Mr.T defintely yes, it says that the exponential regression must follow y = a*exp(b*x)... – alienflow Jun 05 '18 at 20:16
  • 1
    Tip: Test your fit function with a real exponential data set, not the one you have. – Mr. T Jun 05 '18 at 21:29
  • @Mr. T how can I test it if doesn’t work? – alienflow Jun 06 '18 at 05:03

1 Answers1

6

Here is a minimal example for your fit function as close as possible to your code but removing all unnecessary elements. You can easily remove c to adhere to your requirements:

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def func_exp(x, a, b, c):
        #c = 0
        return a * np.exp(b * x) + c

def exponential_regression (x_data, y_data):
    popt, pcov = curve_fit(func_exp, x_data, y_data, p0 = (-1, 0.01, 1))
    print(popt)
    puntos = plt.plot(x_data, y_data, 'x', color='xkcd:maroon', label = "data")
    curva_regresion = plt.plot(x_data, func_exp(x_data, *popt), color='xkcd:teal', label = "fit: {:.3f}, {:.3f}, {:.3f}".format(*popt))
    plt.legend()
    plt.show()
    return func_exp(x_data, *popt)

x_data = np.arange(0, 51) 
y_data = np.array([0.001, 0.199, 0.394, 0.556, 0.797, 0.891, 1.171, 1.128, 1.437, 
        1.525, 1.720, 1.703, 1.895, 2.003, 2.108, 2.408, 2.424,2.537, 
        2.647, 2.740, 2.957, 2.58, 3.156, 3.051, 3.043, 3.353, 3.400, 
        3.606, 3.659, 3.671, 3.750, 3.827, 3.902, 3.976, 4.048, 4.018, 
        4.286, 4.353, 4.418, 4.382, 4.444, 4.485, 4.465, 4.600, 4.681, 
        4.737, 4.792, 4.845, 4.909, 4.919, 5.100])
exponential_regression(x_data, y_data)

Output with c = 0:
enter image description here

Output with c != 0: enter image description here

Main changes explained:

  1. Removed sympy - it has nothing to do with the fitting procedure.
  2. The definition of the exponential fit function is placed outside exponential_regression, so it can be accessed from other parts of the script. It uses np.exp because you work with numpy arrays in scipy.
  3. Added the parameter p0 which contains the initial guesses for the parameters. Fit functions are often sensitive to this initial guess because of local extrema.
  4. Unpack variables with *popt to make it more flexible for different numbers of variables. a = popt[0], b = popt[1], etc.
  5. Removed unnecessary imports. Keep your namespace free from clutter.
Mr. T
  • 11,960
  • 10
  • 32
  • 54
  • Cool! Two questions: (1) How did you determine p0 = (-1, 0.01, 1)? (2) I thought that this data set has a hyperbolic best fit, but now, with the constant c... it turns out that the exponential is better? – alienflow Jun 06 '18 at 13:19
  • 1
    (1) Guesstimates. From (0, 0) we can conclude that a + c = 0. The shape tells us that a < 0. And b should have been set according to the shape to an initial parameter of -0.1 - but hey, it converged nonetheless. (2) Whether it is a better fit than another function is unclear. You have to compare the deviation using pcov to get a measure, how good the fit for each function is. But it surely looks well fitted. – Mr. T Jun 06 '18 at 16:28
  • yes it looks so, so that's why I started doubting.. thanks – alienflow Jun 06 '18 at 16:42