I have recently run into a problem for which I think a multiple-output GP might be a good fit. At the moment I am applying a single-output GP to my data, and my results get worse as the dimensionality increases. I tried a multiple-output model with scikit-learn and got better results in higher dimensions, but I believe GPy is more complete for this kind of task and would give me more control over the model. For the single-output GP I set up the kernel as follows:
import numpy as np
import GPy
kernel = GPy.kern.RBF(input_dim=4, variance=1.0, lengthscale=1.0, ARD=True)
m = GPy.models.GPRegression(X, Y_single_output, kernel=kernel, normalizer=True)
m.optimize_restarts(num_restarts=10)
In the example above, X has shape (20, 4) and Y has shape (20, 1).
I took the multiple-output implementation from Introduction to Multiple Output Gaussian Processes. I prepared the data according to that example, setting X_mult_output to shape (80, 2), with the second column being the input indices, and rearranging Y to shape (80, 1).
kernel = GPy.kern.RBF(1, lengthscale=1, ARD=True) ** GPy.kern.Coregionalize(input_dim=1, output_dim=4, rank=1)
m = GPy.models.GPRegression(X_mult_output, Y_mult_output, kernel=kernel, normalizer=True)
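The stacked layout described above can be pictured with a small NumPy sketch. It assumes that each of the four columns of X becomes its own block of 20 one-dimensional inputs, tagged with an index from 0 to 3. The helper name stack_with_index and the random placeholder data are hypothetical, and the rearrangement of Y is left out because it depends on the data; its rows would need to follow the same order.

```python
import numpy as np

def stack_with_index(X):
    """Stack the columns of X into one (n * d, 2) array: [value, block index]."""
    n, d = X.shape
    blocks = []
    for j in range(d):
        values = X[:, j].reshape(-1, 1)          # (n, 1) values of column j
        index = np.full((n, 1), j, dtype=float)  # (n, 1) column holding the index j
        blocks.append(np.hstack([values, index]))
    return np.vstack(blocks)

rng = np.random.default_rng(0)
X = rng.uniform(size=(20, 4))        # placeholder for the real (20, 4) inputs
X_mult_output = stack_with_index(X)
print(X_mult_output.shape)           # (80, 2)
```

Every row carries its own index in the second column, so any array passed to the kernel later needs the same two-column layout.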
OK, everything seems to work so far. Now I want to predict values, but I am not able to. From what I understood, you can predict a single output by specifying the input index in the Y_metadata argument. As I have 4 inputs, I set up the array I want to predict for as follows:
x_pred = np.array([3,2,2,4])
Then, I assume I have to predict each value of my x_pred array separately, as shown in Coregionalized Regression Model (vector-valued regression):
Y_metadata1 = {'output_index': np.array([[0]])}
y1_pred = m.predict(np.array(x[0]).reshape(1,-1),Y_metadata=Y_metadata1)
The problem is that I keep getting the following error:
IndexError: index 1 is out of bounds for axis 1 with size 1
Any suggestions on how to overcome this problem? Or is there a mistake in my implementation?
Traceback:
Traceback (most recent call last):
  File "<ipython-input-9-edb25bc29817>", line 36, in <module>
    y1_pred = m.predict(np.array(x[0]).reshape(1,-1),Y_metadata=Y_metadata1)
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\core\gp.py", line 335, in predict
    mean, var = self._raw_predict(Xnew, full_cov=full_cov, kern=kern)
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\core\gp.py", line 292, in _raw_predict
    mu, var = self.posterior._raw_predict(kern=self.kern if kern is None else kern, Xnew=Xnew, pred_var=self._predictive_variable, full_cov=full_cov)
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\inference\latent_function_inference\posterior.py", line 276, in _raw_predict
    Kx = kern.K(pred_var, Xnew)
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\kern\src\kernel_slice_operations.py", line 109, in wrap
    with _Slice_wrap(self, X, X2) as s:
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\kern\src\kernel_slice_operations.py", line 65, in __init__
    self.X2 = self.k._slice_X(X2) if X2 is not None else X2
  File "<decorator-gen-140>", line 2, in _slice_X
  File "C:\Users\johndoe\AppData\Roaming\Python\Python37\site-packages\paramz\caching.py", line 283, in g
    return cacher(*args, **kw)
  File "C:\Users\johndoe\AppData\Roaming\Python\Python37\site-packages\paramz\caching.py", line 172, in __call__
    return self.operation(*args, **kw)
  File "c:\users\johndoe\desktop\modules\sheffieldml-gpy-v1.9.9-0-g92f2e87\sheffieldml-gpy-92f2e87\GPy\kern\src\kern.py", line 117, in _slice_X
    return X[:, self._all_dims_active]
IndexError: index 1 is out of bounds for axis 1 with size 1
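
The last frame of the traceback indexes the new input with the kernel's active dimensions. Here is a minimal NumPy sketch of that step alone, assuming the product kernel uses columns 0 and 1 (the value column and the index column, matching the two-column training input). The variable names are illustrative and no GPy call is made.

```python
import numpy as np

# Assumed active dimensions of the product kernel: value column + index column.
all_dims_active = np.array([0, 1])

x_pred = np.array([3, 2, 2, 4])
Xnew = np.array(x_pred[0]).reshape(1, -1)   # shape (1, 1): no index column

try:
    Xnew[:, all_dims_active]                # same slicing pattern as in _slice_X
    error_message = None
except IndexError as exc:
    error_message = str(exc)
print(error_message)

# Appending an index column gives the two-column shape the slice expects.
# Index 0 is used here to mirror Y_metadata1.
Xnew_with_index = np.hstack([Xnew, np.zeros((1, 1))])
sliced = Xnew_with_index[:, all_dims_active]
print(sliced.shape)                         # (1, 2)
```

This only reproduces the shape mismatch: a one-column array cannot be sliced at column 1, while a two-column array can. It does not check that the model itself is set up as intended.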