I have some points in a dataframe that appear to follow a power-law curve. After some googling, I used the approach I found in this post to fit a curve:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

target_func = func_powerlaw

# Skip the first row of the dataframe
X = np.array(selection_to_feed.selection[1:])
y = np.array(selection_to_feed.avg_feed_size[1:])

# Initial guesses for (m, c, c0)
popt, pcov = curve_fit(func_powerlaw, X, y, p0=np.asarray([-1, 10**5, 0]))

curvex = np.linspace(0, 5000, 1000)
curvey = target_func(curvex, *popt)

plt.figure(figsize=(10, 5))
plt.plot(curvex, curvey, '--', label='fit')
plt.plot(X, y, 'ro', label='data')
plt.legend()
plt.show()

This is the result:

[Image: curve]

The problem is that the fitted curve (the blue line) gives negative Y values for the first few X values, whereas in the actual relationship no negative Y values can exist.

A few questions:

  1. What can I do to make sure no negative Y values can be output? Really, an X of 0 should have a Y value of 0 as well.
  2. Is power law curve fitting even the right thing to do? How would you describe this curve?

Thank you!

jaehak
  • Would you please post a link to the data? – James Phillips Jan 30 '20 at 02:34
  • You need to specify the bounds of your parameters; there is a good example of that in the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html In particular, I think that if your bounds are all in the positive range, you should not get any negative value. – Fabrizio Jan 30 '20 at 11:33
  • @Fabrizio please see my answer to this question, which discusses one of the ways to force the curve through the [0,0] point - this also prevents the negative values discussed in the question. – James Phillips Jan 30 '20 at 16:01
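
The bounds suggestion in the comments above can be sketched roughly as follows. This is only an illustration: the data is made up (it is not the asker's dataframe), and the upper limit of 2 on m is an arbitrary assumption chosen to keep x**m from overflowing during the search. With m, c and c0 all constrained to be non-negative, c0 + x**m * c cannot go below zero for x >= 0.

```python
import numpy as np
from scipy.optimize import curve_fit

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

# Placeholder data (hypothetical, NOT the asker's): a noisy increasing curve
rng = np.random.default_rng(0)
x_data = np.linspace(1, 5000, 40)
y_data = 3.0 * x_data**0.4 + rng.normal(0.0, 1.0, x_data.size)

# bounds=(lower, upper), one entry per parameter in the order (m, c, c0).
# The upper limit of 2 on m is an arbitrary assumption for this sketch.
popt, pcov = curve_fit(
    func_powerlaw, x_data, y_data,
    p0=[0.5, 1.0, 1.0],
    bounds=([0, 0, 0], [2, np.inf, np.inf]),
)
print("fitted (m, c, c0):", popt)
```

Note that this only guarantees y >= 0. At x = 0 the model returns c0 (for m > 0), so forcing the curve exactly through [0,0] would additionally require fixing c0 at 0, for example by fitting a two-parameter function without the offset.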

1 Answer

If you are only looking for a simple approximating equation with a better fit: I extracted data from your plot and added the known data point [0,0] per your note. Since the uncertainty for the [0,0] point is zero (that is, you are 100% certain of that value), I used a weighted regression in which that one known point was given an extremely high weight and every other point a weight of 1. This forces the curve through the [0,0] point, and can be done with any software that allows weighted fitting. I found that a Standard Geometric plus offset equation, "y = a * pow(x, (b * x)) + offset", with parameters:

a = -1.0704001788540748E+02
b = -1.5095055897637395E-03
offset =  1.0704001788540748E+02

fits as shown in the attached plot and passes through [0,0]. My suggestion is to run a regression with this equation on the actual data plus the known [0,0] point, using the values above as the initial parameter estimates and, if possible, giving the [0,0] point a very large weight as I did.

[Image: plot]
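
A minimal sketch of such a weighted fit with scipy's curve_fit, under some assumptions: the data below is a placeholder generated from the quoted parameters plus noise (it is not the data extracted from the plot), the function name geometric_offset is made up for illustration, and the sigma of 1e-3 for the [0,0] point is an arbitrary choice standing in for "an extremely high weight". curve_fit takes uncertainties rather than weights, so a very small sigma plays the role of a very large weight.

```python
import numpy as np
from scipy.optimize import curve_fit

def geometric_offset(x, a, b, offset):
    # y = a * x**(b*x) + offset. NumPy evaluates 0.0**0.0 as 1.0, so y(0) = a + offset.
    return a * np.power(x, b * x) + offset

# Parameter values quoted in the answer, used here as initial estimates
p0 = [-1.0704001788540748e+02, -1.5095055897637395e-03, 1.0704001788540748e+02]

# Placeholder data, NOT the asker's: points generated from p0 plus noise
rng = np.random.default_rng(0)
x_data = np.array([0.0, 25, 50, 100, 200, 400, 800, 1600, 3200, 5000])
y_data = geometric_offset(x_data, *p0) + rng.normal(0.0, 1.0, x_data.size)
y_data[0] = 0.0  # the known point [0,0] carries no noise

# curve_fit divides each residual by sigma, so a tiny sigma means a huge weight
sigma = np.ones_like(x_data)
sigma[0] = 1e-3  # arbitrary small value for the known point

popt, pcov = curve_fit(geometric_offset, x_data, y_data, p0=p0, sigma=sigma)
print("fitted (a, b, offset):", popt)
print("fitted y at x = 0:", geometric_offset(0.0, *popt))
```

One caveat worth checking against the real data: with b < 0 and offset = -a, x**(b*x) is slightly greater than 1 for 0 < x < 1, so this form evaluates very slightly below zero in that interval even though it passes through [0,0].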

James Phillips