
I have been doing some fitting in Python using NumPy (which does a least squares fit).

I was wondering if there is a way of getting it to fit data while forcing it through some fixed points? If not, is there another library in Python (or another language I can link to, e.g. C)?

NOTE: I know it's possible to force the fit through one fixed point by moving that point to the origin and forcing the constant term to zero, as noted here, but I was wondering more generally about 2 or more fixed points:

http://www.physicsforums.com/showthread.php?t=523360
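For reference, the one-fixed-point trick mentioned above can be sketched like this (minimal example with made-up data; the fixed point and coefficients are illustrative):

```python
import numpy as np

# Shift the fixed point (x0, y0) to the origin and fit a polynomial
# with no constant term, so the fit passes through it exactly.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 2, 30))
y = 1 + 2 * x - 0.5 * x**2 + rng.normal(0, 0.1, x.size)
x0, y0 = 1.0, 2.5  # the point the fit must pass through (made up)

u, v = x - x0, y - y0
A = np.column_stack([u, u**2])  # design matrix: columns u, u^2, no constant
coef, *_ = np.linalg.lstsq(A, v, rcond=None)

def fitted(xx):
    uu = xx - x0
    return y0 + coef[0] * uu + coef[1] * uu**2

print(fitted(x0))  # → 2.5 (= y0 exactly, since the shifted model has no constant term)
```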

JPH
  • Not sure that interpolation will help here - if my polynomial model doesn't go through the right points no amount of interpolation will make it. – JPH Mar 03 '13 at 21:38
    By fixed points you mean both x and y fixed, right? You can use http://en.wikipedia.org/wiki/Lagrange_polynomial to interpolate while fixing those points. – dranxo Mar 03 '13 at 21:47
    Thanks... that looks interesting. For the moment I've done a work around where I add the fixed points to the data, and weight them loads more than the rest.... Seems to work ok ish... – JPH Mar 04 '13 at 00:38
    @rcompton Lagrange polynomials are great to fit a polynomial going exactly through certain points, but how exactly do you propose using them to approximately fit other points? – Jaime Mar 04 '13 at 07:36
  • @Jaime I interpreted the question as "how do I fit data with the constraint that you pass through given points". They don't really do the same thing that your answer does. But I suppose if you're going to use Lagrange polynomials you might as well fit all the points exactly. – dranxo Mar 04 '13 at 19:00
  • @rcompton It isn't quite the same, having a parabola that goes through two fixed points and then follows as closely as possible 1000 other points, than having a 1001 degree polynomial fitted to your 1002 points... – Jaime Mar 04 '13 at 19:08
  • @JPH I added an 'exact' solution to my answer. Would it be feasible for your case? – eat Mar 18 '13 at 16:33
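The weighting workaround described in the comments (append the fixed points to the data and weight them much more heavily) can be sketched with `np.polyfit`'s `w` argument; the data here is made up:

```python
import numpy as np

# Append the fixed points to the data and give them a much larger
# weight than the ordinary points, so the fit passes very close to them.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = np.sin(2 * x) + rng.normal(0, 0.05, x.size)
xf = np.array([0.0, 1.0])   # fixed x-values (illustrative)
yf = np.array([0.0, 0.9])   # required y-values at xf

xw = np.concatenate([x, xf])
yw = np.concatenate([y, yf])
w = np.concatenate([np.ones(x.size), np.full(xf.size, 1e4)])

coef = np.polyfit(xw, yw, 3, w=w)
print(np.polyval(coef, xf))  # close to yf thanks to the large weights
```

Note this only *approximately* pins the curve; the exact approaches in the answers below use constraints instead.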

3 Answers


The mathematically correct way of doing a fit with fixed points is to use Lagrange multipliers. Basically, you modify the objective function you want to minimize, which is normally the sum of squares of the residuals, adding an extra multiplier for every fixed point. I have not succeeded in feeding a modified objective function to one of scipy's minimizers. But for a polynomial fit, you can figure out the details with pen and paper and convert your problem into the solution of a linear system of equations:
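Sketched on paper (the notation here is mine, not from the answer), the stationarity and constraint conditions give a bordered linear system. With $X$ the Vandermonde matrix of the data, $X_f$ that of the fixed points, $a$ the coefficients and $\lambda$ the multipliers:

$$
\begin{pmatrix} X^\top X & \tfrac{1}{2} X_f^\top \\ X_f & 0 \end{pmatrix}
\begin{pmatrix} a \\ \lambda \end{pmatrix}
=
\begin{pmatrix} X^\top y \\ y_f \end{pmatrix}
$$

which is exactly the matrix the code below assembles (the factor $\tfrac{1}{2}$ corresponds to the `xf_n / 2` term).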

import numpy as np

def polyfit_with_fixed_points(n, x, y, xf, yf):
    # Bordered system: normal equations for the degree-n fit in the
    # top-left block, fixed-point constraints in the border.
    mat = np.empty((n + 1 + len(xf),) * 2)
    vec = np.empty((n + 1 + len(xf),))
    x_n = x**np.arange(2 * n + 1)[:, None]
    yx_n = np.sum(x_n[:n + 1] * y, axis=1)
    x_n = np.sum(x_n, axis=1)
    idx = np.arange(n + 1) + np.arange(n + 1)[:, None]
    mat[:n + 1, :n + 1] = np.take(x_n, idx)
    xf_n = xf**np.arange(n + 1)[:, None]
    mat[:n + 1, n + 1:] = xf_n / 2
    mat[n + 1:, :n + 1] = xf_n.T
    mat[n + 1:, n + 1:] = 0
    vec[:n + 1] = yx_n
    vec[n + 1:] = yf
    params = np.linalg.solve(mat, vec)
    return params[:n + 1]  # coefficients in increasing-degree order

To test that it works, try the following, where n is the number of points, d the degree of the polynomial and f the number of fixed points:

import numpy as np
import matplotlib.pyplot as plt

n, d, f = 50, 8, 3
x = np.random.rand(n)
xf = np.random.rand(f)
poly = np.polynomial.Polynomial(np.random.rand(d + 1))
y = poly(x) + np.random.rand(n) - 0.5
yf = np.random.uniform(np.min(y), np.max(y), size=(f,))
params = polyfit_with_fixed_points(d, x, y, xf, yf)
poly = np.polynomial.Polynomial(params)
xx = np.linspace(0, 1, 1000)
plt.plot(x, y, 'bo')
plt.plot(xf, yf, 'ro')
plt.plot(xx, poly(xx), '-')
plt.show()

[plot: noisy data (blue), fixed points (red), and the fitted degree-8 polynomial passing through them]

And of course the fitted polynomial goes exactly through the points:

>>> yf
array([ 1.03101335,  2.94879161,  2.87288739])
>>> poly(xf)
array([ 1.03101335,  2.94879161,  2.87288739])
Jaime
    If using the poly1d() constructor suggested here: https://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html, the order of the params slice is the reverse of what's expected. The simple fix is to change `return params[:n + 1]` to `return params[:n + 1][::-1]`. – ijustlovemath Jan 02 '17 at 23:30
  • Thank you for the code, it is very neat. Taking it one step further, do you think it is possible to add the condition "The total area below the polynomial has to be `A`"? where `A` is a constant. – Stefano Apr 07 '19 at 16:37
  • The logic here is over my head, but I'm wondering if there's a simple way to modify this for the L^n norm, rather than L2. I'm trying to minimize "Maximum Absolute Deviation", which is approached as n->∞, or ~=L^(big number). – odkken Jun 17 '22 at 16:47
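(Not from the answer:) for readers who do want to hand the constrained objective directly to one of scipy's minimizers, here is a sketch using `scipy.optimize.minimize` with SLSQP equality constraints, on made-up data. The same constraint mechanism also admits other conditions, such as the area constraint asked about above:

```python
import numpy as np
from scipy import optimize
from numpy.polynomial import polynomial as Poly

# Equality-constrained polynomial fit via SLSQP (illustrative data).
rng = np.random.default_rng(0)
x = rng.random(50)
y = Poly.polyval(x, [1.0, -2.0, 3.0]) + rng.normal(0, 0.1, 50)
xf = np.array([0.2, 0.8])   # fixed x-values (made up)
yf = np.array([0.5, 2.0])   # required y-values at xf
deg = 4

def sse(c):
    # Sum of squared residuals over the ordinary data points.
    return np.sum((Poly.polyval(x, c) - y) ** 2)

# One vector-valued equality constraint: the polynomial must hit (xf, yf).
cons = {"type": "eq", "fun": lambda c: Poly.polyval(xf, c) - yf}
res = optimize.minimize(sse, np.zeros(deg + 1), method="SLSQP",
                        constraints=[cons])
```

With a quadratic objective and linear constraints this converges quickly, and `Poly.polyval(xf, res.x)` matches `yf` to roughly the solver's tolerance.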

If you use curve_fit(), you can use the sigma argument to give every point a weight. The following example gives the first, middle, and last points a very small sigma, so the fitting result will be very close to these three points:

import numpy as np
from scipy import optimize
from matplotlib.pyplot import plot, legend, show

N = 20
x = np.linspace(0, 2, N)
np.random.seed(1)
noise = np.random.randn(N)*0.2
sigma = np.ones(N)
sigma[[0, N//2, -1]] = 0.01  # tiny sigma = heavy weight at these points
y = 1 + 3.0*x**2 - 2*x**3 + 0.3*x**4 + noise

def f(x, *p):
    return np.poly1d(p)(x)

p1, _ = optimize.curve_fit(f, x, y, (0, 0, 0, 0, 0), sigma=sigma)
p2, _ = optimize.curve_fit(f, x, y, (0, 0, 0, 0, 0))

x2 = np.linspace(0, 2, 100)
plot(x, y, "o")
plot(x2, f(x2, *p1), "r", label=u"fix three points")
plot(x2, f(x2, *p2), "b", label=u"no fix")
legend(loc="best")
show()

[plot: noisy data with the weighted fit (red) passing close to the three pinned points and the unweighted fit (blue)]

HYRY

One simple and straightforward way is to use constrained least squares, where the constraints are weighted with a largish number M, like:

from numpy import dot
from numpy.linalg import solve
from numpy.polynomial.polynomial import Polynomial as P, polyvander as V

def clsq(A, b, C, d, M=1e5):
    """A simple constrained least squares solution of Ax = b, s.t. Cx = d,
    based on the idea of weighting the constraints with a largish number M."""
    return solve(dot(A.T, A) + M*dot(C.T, C), dot(A.T, b) + M*dot(C.T, d))

def cpf(x, y, x_c, y_c, n, M=1e5):
    """Constrained polynomial fit based on the clsq solution."""
    return P(clsq(V(x, n), y, V(x_c, n), y_c, M))

Obviously this is not an all-inclusive silver-bullet solution, but it seems to work reasonably well with a simple example (for M in [0, 4, 24, 124, 624, 3124]):

In []: from pylab import *
In []: x = linspace(-6, 6, 23)
In []: y = sin(x) + 4e-1*rand(len(x)) - 2e-1
In []: x_f, y_f = linspace(-(3./2)*pi, (3./2)*pi, 4), array([1, -1, 1, -1])
In []: n, x_s = 5, linspace(-6, 6, 123)

In []: plot(x, y, 'bo', x_f, y_f, 'bs', x_s, sin(x_s), 'b--')
Out[]: <snip>

In []: for M in 5** (arange(6))- 1:
   ....:     plot(x_s, cpf(x, y, x_f, y_f, n, M)(x_s))
   ....: 
Out[]: <snip>

In []: ylim([-1.5, 1.5])
Out[]: <snip>
In []: show()

and producing output like: [plot: fits with progressively larger M]

Edit: Added 'exact' solution:

from numpy import dot
from numpy.linalg import solve
from numpy.polynomial.polynomial import Polynomial as P, polyvander as V
from scipy.linalg import qr 

def solve_ns(A, b):
    """Least squares solution of Ax = b via the normal equations."""
    return solve(dot(A.T, A), dot(A.T, b))

def clsq(A, b, C, d):
    """An 'exact' constrained least squares solution of Ax = b, s.t. Cx = d,
    via a QR decomposition of the constraint matrix."""
    p = C.shape[0]
    Q, R = qr(C.T)
    # Particular part satisfying the constraints exactly, then a least
    # squares fit over the remaining null-space directions of C.
    xr, AQ = solve(R[:p].T, d), dot(A, Q)
    xaq = solve_ns(AQ[:, p:], b - dot(AQ[:, :p], xr))
    return dot(Q[:, :p], xr) + dot(Q[:, p:], xaq)

def cpf(x, y, x_c, y_c, n):
    """Constrained polynomial fit based on the clsq solution."""
    return P(clsq(V(x, n), y, V(x_c, n), y_c))

and testing the fit:

In []: from pylab import *
In []: x = linspace(-6, 6, 23)
In []: y = sin(x) + 4e-1*rand(len(x)) - 2e-1
In []: x_f, y_f = linspace(-(3./2)*pi, (3./2)*pi, 4), array([1, -1, 1, -1])
In []: n, x_s = 5, linspace(-6, 6, 123)
In []: p= cpf(x, y, x_f, y_f, n)
In []: p(x_f)
Out[]: array([ 1., -1.,  1., -1.])
eat