
I have a set of N samples (N ~ 10,000 to 100,000):

(y_i, P_i)

They sample an unknown function:

y = f(P)

In my case, P is a set of n_p parameters with n_p typically around 10.

My aim is to use the samples to find the second-order polynomial that best approximates my unknown function. The typical result would be the (n_p+1)+(n_p+1)*n_p/2 coefficients of the best-fit polynomial.
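For the plain order-2 fit itself, I imagine something like this sketch would give those raw coefficients (assuming scikit-learn is acceptable; the random data here is purely illustrative):

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Purely illustrative data: N samples of n_p parameters each
rng = np.random.default_rng(0)
N, n_p = 1000, 10
P = rng.random((N, n_p))
y = rng.random(N)

X = PolynomialFeatures(degree=2).fit_transform(P)  # columns: 1, P_i, P_i*P_j
fit = LinearRegression(fit_intercept=False).fit(X, y)
coeffs = fit.coef_  # the (n_p+1)+(n_p+1)*n_p/2 best-fit coefficients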

However, I would like to obtain the solution in the following form:

f(P) = (P-mu)*(C^-1)*(P-mu)^T + K

where mu is a vector of dimension n_p, C is an (n_p x n_p) symmetric matrix, and K is a constant. These mu, C and K are what I would like to obtain.
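For reference, expanding this form shows it is just a reparametrization of the general order-2 polynomial above (writing A = C^-1, with P a row vector as in the formula):

f(P) = P*A*P^T - 2*(mu*A)*P^T + (mu*A*mu^T + K)

so the quadratic coefficients determine A, the linear ones give -2*mu*A, and the constant absorbs mu*A*mu^T + K.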

Is there a standard implementation somewhere in the Python ecosystem, and/or an efficient way to do this?

Thanks in advance!

Edit: Here is a typical setup, where I create fake samples (P and Y); using only these, I would like to recover the original mu, C and K:

import numpy as np

n_p = 3
N = 15

# Generate the "mu" we'll try to recover
mu = np.random.rand(n_p)

# Generate the "C" we'll try to recover
tmp = np.random.rand(n_p,n_p)
C = np.dot(tmp, tmp.T)

# Generate the "K" we'll try to recover
K = np.random.rand()

# Generate the samples: evaluate the quadratic form at N random points
P = np.random.rand(n_p, N)
C_inv = np.linalg.inv(C)  # invert once instead of once per sample
diff = P.T - mu           # (N, n_p): each row is P_i - mu
Y = np.einsum('ij,jk,ik->i', diff, C_inv, diff) + K
1 Answer


Is something like this what you're looking for? I'm not 100% sure it's accurate for your use case (I don't know what additional constraints you can impose when fitting an order-2 polynomial), but it should try to find a u (your mu), an A (what you called C), and a K that minimize the least-squares error.

import numpy as np
from scipy import optimize
import matplotlib.pyplot as plt

samples = 100
num_params = 20

# Fake data to fit against
y = np.random.rand(samples)
p = np.random.rand(samples, num_params)


def my_func(params):
    """Sum of squared residuals between y and (p-u) a^-1 (p-u)^T + k."""
    u = params[0:num_params]                    # the offset vector (your mu)
    a = params[num_params:-1].reshape(num_params, num_params)  # your C
    k = params[-1]                              # the constant K
    a_inv = np.linalg.inv(a)
    shifted_p = p - u                           # broadcasts over the samples
    # Row-wise quadratic form: (p_i - u) a_inv (p_i - u)^T for each sample i
    quad = np.einsum('ij,jk,ik->i', shifted_p, a_inv, shifted_p)
    return np.sum((y - k - quad) ** 2)


guess = np.random.rand(num_params + num_params**2 + 1)
new_params = optimize.fmin_cg(my_func, guess)
new_u = new_params[0:num_params]
new_a = new_params[num_params:-1].reshape(num_params, num_params)
new_k = new_params[-1]

# Evaluate the fitted quadratic form and compare it to the original samples
fitted = np.einsum('ij,jk,ik->i',
                   p - new_u, np.linalg.inv(new_a), p - new_u) + new_k
original_y, = plt.plot(np.arange(samples), y)
new_y, = plt.plot(np.arange(samples), fitted)
plt.legend([original_y, new_y], ['Original Y', 'Newly Fitted Y'])
plt.show()

There's probably a more mathematically sound approach to this, but hopefully this at least helps.
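For instance, since the quadratic form is just a reparametrized order-2 polynomial, one linear least-squares solve should do it, with no minimizer at all. Here is a sketch of that idea (fit_quadratic_form is my own helper; it assumes the recovered quadratic part is invertible):

import numpy as np
from itertools import combinations_with_replacement


def fit_quadratic_form(P, Y):
    """Recover mu, C, K from P of shape (n_p, N) and Y of shape (N,)."""
    n_p, N = P.shape
    pairs = list(combinations_with_replacement(range(n_p), 2))
    # Design matrix columns: P_i*P_j for i <= j, then P_i, then a constant 1
    X = np.empty((N, len(pairs) + n_p + 1))
    for col, (i, j) in enumerate(pairs):
        X[:, col] = P[i] * P[j]
    X[:, len(pairs):-1] = P.T
    X[:, -1] = 1.0
    coeffs, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Rebuild the symmetric A = C^-1 (an off-diagonal monomial P_i*P_j
    # carries both A_ij and A_ji, hence the factor 1/2)
    A = np.zeros((n_p, n_p))
    for col, (i, j) in enumerate(pairs):
        A[i, j] = A[j, i] = coeffs[col] if i == j else coeffs[col] / 2
    b = coeffs[len(pairs):-1]   # linear coefficients, equal to -2*A*mu
    c = coeffs[-1]              # constant, equal to mu*A*mu^T + K
    mu = np.linalg.solve(A, -b / 2)
    K = c - mu @ A @ mu
    return mu, np.linalg.inv(A), K

With the noise-free samples from the question's edit (mu_fit, C_fit, K_fit = fit_quadratic_form(P, Y)), this should recover the original mu, C and K up to numerical precision, since the N = 15 samples exceed the 10 unknowns there.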

Edit: Just noticed I messed up the sign of k. I also needed to make the number of parameters higher. This seems to work super well though. I'm pretty much exactly recovering the original noise.

Also adding sample output.

[Sample output: plot of Original Y vs. Newly Fitted Y]

  • Nice solution! Though I would have liked to avoid running a minimizer to fit the data, it might be unavoidable... – S.I. Jul 29 '17 at 20:53