
I'm currently using Keras to create a neural net in Python. I have a basic model, and the code looks like this:

    from keras.layers import Dense
    from keras.models import Sequential

    model = Sequential()
    model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
    model.add(Dense(500, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal', activation="relu"))
    model.compile(loss='mean_squared_error', optimizer='adam')

It works well and gives me good predictions for my use case. However, I would like to use a Variational Gaussian Process layer to give me an estimate of the prediction interval as well. I'm new to this type of layer and am struggling a bit to implement it. The TensorFlow Probability documentation on it can be found here:

https://www.tensorflow.org/probability/api_docs/python/tfp/layers/VariationalGaussianProcess

However, I'm not seeing that same layer in the Keras library. For further reference, I'm trying to do something similar to what was done in this article:

https://blog.tensorflow.org/2019/03/regression-with-probabilistic-layers-in.html

There seems to be a bit more complexity when you have 23 inputs instead of one, and that's the part I'm not understanding. I'm also open to other methods of achieving the target objective. Any examples of how to do this or insights on other approaches would be greatly appreciated!

bballboy8

1 Answer


tensorflow_probability is a separate library, but it is built to work with Keras and TensorFlow. You can add its probabilistic layers to your code and turn the model into a probabilistic one. If your goal is just to get a prediction interval, it is simpler to use the DistributionLambda layer than the VariationalGaussianProcess layer. So your code would be as follows:

    from keras.layers import Dense
    from keras.models import Sequential
    from sklearn.datasets import make_regression
    import tensorflow_probability as tfp
    import tensorflow as tf

    tfd = tfp.distributions

    # Sample regression data with 23 features
    X, y = make_regression(n_samples=100, n_features=23, noise=4.0, bias=15)

    # Loss function: negative log likelihood of y under the predicted distribution
    negloglik = lambda y, p_y: -p_y.log_prob(y)

    # Model
    model = Sequential()
    model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
    model.add(Dense(500, kernel_initializer='normal', activation='relu'))
    # Two outputs per sample: one for the mean, one for the (unconstrained) scale
    model.add(Dense(2))
    # Turn those two outputs into a Normal distribution: loc from the first,
    # a positive scale from the second via softplus
    model.add(tfp.layers.DistributionLambda(
          lambda t: tfd.Normal(loc=t[..., :1],
                               scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))))

    model.compile(loss=negloglik, optimizer='adam')

    model.fit(X, y, epochs=250, verbose=0)
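
As a quick sanity check of what `negloglik` computes, the loss for a standard Normal evaluated at its own mean is just 0.5 * log(2π) ≈ 0.919 (toy values for illustration):

    # -log p(y) for y = 0 under a standard Normal distribution
    print(negloglik(tf.constant(0.0), tfd.Normal(loc=0.0, scale=1.0)).numpy())  # ~0.9189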


After training your model, you can get your prediction distribution with the following lines:

    yhat = model(X)        # calling the model returns a distribution object
    means = yhat.mean()    # prediction means
    stds = yhat.stddev()   # prediction standard deviations
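
If what you ultimately want is a prediction interval, you can build one from these two tensors, e.g. a rough 95% interval under the predicted Normal (1.96 is the standard normal quantile):

    # Approximate 95% prediction interval per sample
    lower = means - 1.96 * stds
    upper = means + 1.96 * stds
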
ahmet hamza emra
  • If I'm understanding correctly, the standard deviation here will vary. Is that correct? – bballboy8 Feb 12 '22 at 04:55
  • Also I get an error at the std line. ``` AttributeError: 'Normal' object has no attribute 'std' ``` – bballboy8 Feb 12 '22 at 05:16
  • It should be `yhat.stddev()` – ahmet hamza emra Feb 12 '22 at 05:22
  • Yes, it gives prediction distribution for every prediction made. Means and standard deviation will vary – ahmet hamza emra Feb 12 '22 at 05:24
  • stds is just a list of 1s. I don't think this is working as intended. – bballboy8 Feb 12 '22 at 19:11
  • The scale parameter controls the std of the predictions. In the example I gave, we made the assumption of setting them to 1. I made the changes so now the std will be learned as well – ahmet hamza emra Feb 12 '22 at 19:56
  • I think there is an error with the changes you made. During training, the model has no loss: `43749/48339 [==========================>...] - ETA: 3s - loss: 0.0000e+00` Furthermore, the value of means and std is this: `tf.Tensor([], shape=(773411, 0), dtype=float32)` – bballboy8 Feb 12 '22 at 23:24
  • https://stackoverflow.com/questions/772124/what-does-the-ellipsis-object-do – ahmet hamza emra Feb 12 '22 at 23:31
  • Does that mean I need to slice the means and stds? – bballboy8 Feb 12 '22 at 23:34
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/241965/discussion-between-ahmet-hamza-emra-and-bballboy8). – ahmet hamza emra Feb 12 '22 at 23:34
  • This produces drastically different predictions than the model in the original post. I need those same predictions since they were much more accurate. – bballboy8 Feb 13 '22 at 00:15
  • At this point the issue is with the data you are training on. You should open another question for that – ahmet hamza emra Feb 13 '22 at 00:18
  • I don't think it is. I don't think I understand this: `model.add(Dense(2))`. I don't have two values for training, just my actual values. So I don't understand how we can expect an output of 2 nodes. – bballboy8 Feb 13 '22 at 01:50
  • You need one of them to get the mean and the other to get the std. You are not making two regression predictions, you are making one distribution prediction. – ahmet hamza emra Feb 15 '22 at 00:05
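
To illustrate that last comment, here is a minimal sketch (with made-up values) of how the two `Dense(2)` outputs are split into a mean and a positive standard deviation inside the `DistributionLambda` layer:

    import tensorflow as tf
    import tensorflow_probability as tfp
    tfd = tfp.distributions

    # Pretend these are the raw Dense(2) outputs for a batch of two samples
    t = tf.constant([[2.0, -1.0],
                     [5.0,  0.5]])

    loc = t[..., :1]                                    # first column -> predicted mean
    scale = 1e-3 + tf.math.softplus(0.05 * t[..., 1:])  # second column -> positive std

    dist = tfd.Normal(loc=loc, scale=scale)
    print(dist.mean())    # per-sample means
    print(dist.stddev())  # per-sample standard deviations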