I'm trying to use GPflow to fit a GP. I would like to use automatic relevance determination and a prior for the lengthscales.
I know how to do both separately:
kernel = gpflow.kernels.SquaredExponential(lengthscales=([10] * X_train.shape[1]))
and
kernel.lengthscales.prior = tfp.distributions.Gamma(
to_default_float(3), to_default_float(0.25)
)
...but I would like to do both (so basically a different Gamma distribution as a prior for each feature).
I tried just using both lines of code and there is no error, but the prior does not seem to add or change anything.
How can I combine both these things?
EDIT: I played with it some more and I thought maybe it would not matter that much, as the lengthscales are adjusted during training. However, the starting point of the lengthscales has a significant impact on the accuracy of my model, and they never change dramatically from the starting point.
For instance, initializing with lengthscales = 10 gives optimized lengthscales between 7 and 13, initializing with 15 gives 12 to 18, and so on. Initializing with smaller lengthscales such as 0.1 or 1 leads to optimized lengthscales closer to 10.
Still, I think it would be very valuable to be able to set a separate prior for every feature when using ARD. Next, I might investigate whether that is (only?) possible using MCMC methods.