
I am trying to perform inverse inference on a simple Bayesian network for piecewise linear regression; that is, y is a piecewise linear function of x:

[Plot of Y vs X]

and the Bayesian network looks like this:

[Bayesian Network Model]

Here, X has a normal distribution, K is a discrete node with a softmax distribution conditioned on X, and Y is a mixture of linear Gaussians selected by the value of K, i.e. Pr(Y | K=i, X=x) ~ N(mu = w_i*x + b_i, s_i). I have learned the parameters of this model using the EM algorithm. (The actual relationship between Y and X has five linear pieces, but I learned the model using 8 levels for the discrete node.) I then built the PyMC model from those parameters. Here is the code:

import numpy as np
import pymc

# Prior on x (pymc parameterises the Normal by precision tau = 1/variance)
x = pymc.Normal('x', mu=0.5, tau=1.0/0.095)

# The probabilities of the discrete node given x; softmax distribution
epower = [-11.818, 54.450, 29.270, -13.038, 73.541, 28.466, -57.530, -101.568]
bias = [7.8228, -35.3859, -12.9512, 12.8004, -48.1097, -13.2229, 30.6079, 39.3811]

@pymc.deterministic(plot=False)
def prob(epower=epower, bias=bias, x=x):
    # Softmax over the 8 levels; subtracting the max exponent keeps np.exp from overflowing
    e = np.array([ep*x + bb for ep, bb in zip(epower, bias)])
    pr = np.exp(e - e.max())
    return pr / pr.sum()

knode = pymc.Categorical('knode', p=prob)

# The regression weights w_i of each linear piece
wtsY = [15.022, -70.000, -14.996, 15.026, -70.000, -14.996, 34.937, 15.027]
# The intercepts b_i (the unconditional means of Y)
meansY = [5.9881, 68.0000, 23.9973, 5.9861, 68.0000, 23.9972, -1.9809, 1.9982]
# The per-component noise parameters (used below as tau = 1/condsigmaY)
sigmasY = [0.010189, 0.010000, 0.010033, 0.010211, 0.010000, 0.010036, 0.010380, 0.010167]

@pymc.deterministic(plot=False)
def condmeanY(knode=knode, x=x, wtsY=wtsY, meansY=meansY):
    return wtsY[knode]*x + meansY[knode]

@pymc.deterministic(plot=False)
def condsigmaY(knode=knode, sigmasY=sigmasY):
    return sigmasY[knode]

# y is observed at 13.5
y = pymc.Normal('y', mu=condmeanY, tau=1.0/condsigmaY, value=13.5, observed=True)

I want to predict x when y is observed (inverse inference). As y is not a monotonic function of x, a given value of y can correspond to multiple values of x, and I expect the obtained trace of x to show those multiple solutions. I have ensured that autocorrelation is very low (sample=2000, burn=1000). But I am not able to see multiple solutions. In the above example, for y=13.5, there are two possible solutions, x=0.5 and x=0.7, but the chain only wanders near 0.5: the histogram has a single peak, at 0.5.
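For reference, here is roughly how I sample and inspect the trace (a minimal sketch; the variable names follow the model above, and matplotlib is just one way to view the histogram):

import matplotlib.pyplot as plt

model = pymc.MCMC([x, prob, knode, condmeanY, condsigmaY, y])
model.sample(iter=2000, burn=1000)

# Posterior samples of x; with two solutions I would expect two modes here
xtrace = model.trace('x')[:]
plt.hist(xtrace, bins=50)
plt.xlabel('x')
plt.show()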

Am I missing something?

EDIT: I came across this very relevant question: Solving inverse problems with PyMC. What I learned from the answer there is that the prior of x, which I am assuming here to be a unimodal Gaussian, should instead be non-parametric, and the samples obtained after the first iteration can then be used to update it. Kernel density estimation (with a Gaussian kernel) was suggested as a way to obtain a non-parametric stochastic from data. I incorporated this into my model, but there is still no difference. One thing I noted is that if I run the inference multiple times, I get 0.5 about 50% of the time and 0.7 the other 50% (I am not sure whether this was the case earlier as well, because I had not run that model enough times to notice). But still, should I not see two peaks in the trace after just the first iteration?
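Concretely, I replace the Gaussian prior on x with a KDE-based stochastic between iterations, roughly like this (a minimal sketch, assuming xsamples is the trace of x from the previous iteration; scipy.stats.gaussian_kde.logpdf may require a reasonably recent SciPy):

from scipy import stats

def make_kde_prior(xsamples):
    kde = stats.gaussian_kde(xsamples)

    @pymc.stochastic(name='x')
    def x(value=float(np.median(xsamples))):
        # log-density of the current value under the KDE of the previous trace
        return float(kde.logpdf(value))

    return x

The returned stochastic replaces the original x, the deterministic nodes are rebuilt on top of it, and the model is resampled for the next iteration.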

I also tried a modified version of this model in which the edge from X to K is reversed; this is a classical conditional linear Gaussian model. Even with this model, I could not get multiple solutions to show up in the trace. I am sort of stuck here. Please help.
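For completeness, the reversed-edge (K -> X) variant is structured roughly like this (a sketch only; pk, musX and tausX stand in for the parameters I learned with EM for that variant and are placeholder values here):

# Unconditional prior over the discrete node (placeholder probabilities)
pk = np.ones(8) / 8.0
knode2 = pymc.Categorical('knode2', p=pk)

# X is Gaussian given K (placeholder per-component means and precisions)
musX = [0.1, 0.25, 0.4, 0.55, 0.65, 0.75, 0.85, 0.95]
tausX = [100.0] * 8

@pymc.deterministic(plot=False)
def mu_x(knode2=knode2, musX=musX):
    return musX[knode2]

@pymc.deterministic(plot=False)
def tau_x(knode2=knode2, tausX=tausX):
    return tausX[knode2]

x2 = pymc.Normal('x2', mu=mu_x, tau=tau_x)
# y is then a linear Gaussian in x2 given knode2, exactly as in the first model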

