How to make predictions with GPflow when running GPC on simple input data? Code from the example notebook fails on different data

Question

I tried to run the code from the notebook on self-generated data to check whether the model would perform any classification: https://gpflow.readthedocs.io/en/master/notebooks/basics/classification.html

So I created X and Y as input data.

    import numpy as np
    import gpflow

    X = np.array([-0.0259, -0.3579, -0.289, 0.0356, 0.0147, 0.0234]).reshape(-1, 1)
    Y = np.array([0, 0, 0, 1, 1, 1]).reshape(-1, 1)

The values in X and Y follow a simple binary rule: a negative value in X corresponds to 0 in Y, and a positive value in X should be classified as 1 in Y.

Then I created a model and trained it:

    Per = gpflow.kernels.Periodic(gpflow.kernels.SquaredExponential())
    model_Per = gpflow.models.VGP((X, Y), likelihood=gpflow.likelihoods.Bernoulli(), kernel=Per)

I then tried to predict the class Y for the same X that was used as input for the model, just to see whether I would get the right result.

    Ypred, VARpred = model_Per.predict_y(X)

For Ypred I get the output:

    <tf.Tensor: shape=(6, 1), dtype=float64, numpy=
    array([[0.5],
           [0.5],
           [0.5],
           [0.5],
           [0.5],
           [0.5]])>

For VARpred I get:

    <tf.Tensor: shape=(6, 1), dtype=float64, numpy=
    array([[0.25],
           [0.25],
           [0.25],
           [0.25],
           [0.25],
           [0.25]])>

I tried changing the kernel, combining kernels, running an optimization with Scipy before predicting, and changing the data, but I always get the same output for mean and variance. I was expecting Ypred = Y for this data set.

What am I doing wrong in creating this classification model?

I used `opt = gpflow.optimizers.Scipy(); opt.minimize(model_Per.training_loss, variables=model_Per.trainable_variables)` — Vadim5, Jul 04 '20 at 05:57
It's better than 0.5 for each X in the list, but it's not 1 or 0. Is there any other method to perform classification on 1-dimensional data? Or another optimization method that could be recommended? — Vadim5, Jul 04 '20 at 06:01
I would like to test the classification on data with 800 or even 8000 elements in X and Y. Is there a better way in GPflow? — Vadim5, Jul 04 '20 at 06:03

STJ · Answer 1 · 2020-07-09T09:52:48.610

You have to actually optimise your model. Once you optimise it, the results look very reasonable. I would not expect a GP model to predict exactly p=1: that would mean a 0.0% probability of ever observing a 0 at this point, which I would only believe if I had seen an infinite amount of data all saying 1...

For the Bernoulli likelihood you are using, the variance is deterministically related to the mean. If y ~ Bernoulli, and Mean[y] = p, then Var[y] = p * (1 - p). For you, the mean is p=0.5, so the variance is 0.5 * (1 - 0.5) = 0.25.
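As a small illustration of that relationship, here is a plain-Python sketch that does not call GPflow. It assumes a probit link (GPflow's default for the Bernoulli likelihood, as far as I know) and a latent GP whose predictive mean is still zero before training; the function names are made up for this example.

```python
import math

def probit(x):
    """Standard normal CDF, used here as the inverse link."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bernoulli_mean_var(f_mean, f_var):
    """Predictive mean and variance of y for a latent f ~ N(f_mean, f_var).

    Uses the closed form E[probit(f)] = probit(f_mean / sqrt(1 + f_var)).
    """
    p = probit(f_mean / math.sqrt(1.0 + f_var))
    return p, p * (1.0 - p)

# Untrained model (assumed zero latent mean): p = 0.5 whatever the latent variance is.
p0, v0 = bernoulli_mean_var(0.0, 1.0)
print(p0, v0)  # 0.5 0.25

# Once the latent mean moves away from zero, p moves towards 1 and the variance shrinks.
p1, v1 = bernoulli_mean_var(2.0, 1.0)
print(p1, v1)
```

Under these assumptions, a zero latent mean gives exactly the 0.5 and 0.25 seen in the question, independent of X.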
