
Suppose we're given a prior on X (e.g. X ~ Gaussian) and a forward operator y = f(x). Suppose further that we have observed y by means of an experiment and that this experiment can be repeated indefinitely. The output Y is assumed to be either Gaussian (Y ~ Gaussian) or noise-free (Y ~ Delta(observation)).

How do we consistently update our subjective degree of knowledge about X given the observations? I've tried the following model with PyMC, but it seems I'm missing something:

from pymc import *

xtrue = 2                         # this value is unknown in the real application
x = rnormal(0, 0.01, size=10000)  # initial guess (rnormal takes a precision tau, so tau=0.01 means variance 100)

for i in range(5):
    X = Normal('X', x.mean(), 1./x.var())   # Gaussian prior fitted to the current samples (mu, tau)
    Y = X*X                                  # forward operator f(x) = x*x
    OBS = Normal('OBS', Y, 0.1, value=xtrue*xtrue+rnormal(0,1), observed=True)  # noisy observation of y
    model = Model([X, Y, OBS])
    mcmc = MCMC(model)
    mcmc.sample(10000)

    x = mcmc.trace('X')[:]        # posterior samples become the next prior

The posterior is not converging to xtrue.

asked by juliohm (edited by Robert Harvey)

2 Answers


The functionality proposed by @user1572508 is now part of PyMC under the name stochastic_from_data() or Histogram(). The solution for this thread then becomes:

from pymc import *
from pymc import Matplot           # trace-plotting utilities
import matplotlib.pyplot as plt

xtrue = 2                          # unknown in the real application
prior = rnormal(0, 1, 10000)       # initial guess is inaccurate

for i in range(5):
    x = stochastic_from_data('x', prior)   # nonparametric prior built from the current samples
    y = x*x                                # forward operator f(x) = x*x
    obs = Normal('obs', y, 0.1, xtrue*xtrue + rnormal(0, 1), observed=True)

    model = Model([x, y, obs])
    mcmc = MCMC(model)
    mcmc.sample(10000)

    Matplot.plot(mcmc.trace('x'))
    plt.show()

    prior = mcmc.trace('x')[:]             # posterior samples become the next prior
answered by juliohm

The problem is that your function, $y = x^2$, is not one-to-one. Specifically, because you lose all information about the sign of X when you square it, there is no way to tell from your Y values whether you originally used 2 or -2 to generate the data. If you create a histogram of your trace for X after just the first iteration, you will see this:

[figure: histogram of the X trace after the first iteration]

This distribution has 2 modes, one at 2 (your true value) and one at -2. At the next iteration, x.mean() will be close to zero (averaging over the bimodal distribution), which is obviously not what you want.
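To make the failure mode concrete, here is a small NumPy sketch (the mode locations and spreads are made-up illustrative values, not taken from the thread) of what happens when a bimodal sample is summarized by its mean and variance:

import numpy as np

# An illustrative bimodal sample, with modes near +2 and -2 like the X trace above
samples = np.concatenate([np.random.normal(2.0, 0.3, 5000),
                          np.random.normal(-2.0, 0.3, 5000)])

print(samples.mean())   # close to 0, i.e. between the two modes
print(samples.var())    # large, because the sample spans both modes

# Feeding these two numbers into Normal('X', mean, 1/var) at the next iteration
# replaces the bimodal posterior with a broad Gaussian centered near 0,
# discarding the sign information entirely.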

answered by jcrudy
  • I know f(x) isn't a bijection; it was chosen for that specific reason. I can't see any argument for MCMC to fail with this input distribution; as far as I know, MCMC is able to sample complex multi-modal distributions. – juliohm Jul 02 '13 at 12:00
  • Oh, I understand your point now; the problem is in how I update the prior. By simply using pos.mean() and pos.var() I'm assuming a unimodal solution. How would you solve this problem to find both 2 and -2? – juliohm Jul 02 '13 at 12:05
  • In other words, how to represent non-parametric distributions with PyMC? Given the trace, produce a PDF that matches the histogram. – juliohm Jul 02 '13 at 12:11
  • I've encountered that same problem before. I used a kernel density estimator to create a nonparametric prior that I could update. It's kind of slow, but it seems to work well enough for simple problems (a minimal sketch of the idea appears after these comments). Here's a gist using your example: https://gist.github.com/jcrudy/5911624 – jcrudy Jul 02 '13 at 18:10
  • Kernel density estimation works fine for multimodal distributions. Check out its Wikipedia page, and particularly this example: http://en.wikipedia.org/wiki/File:Comparison_of_1D_histogram_and_KDE.png. In short, although the kernel is Gaussian, the resulting density estimate, which is a weighted sum of location-shifted Gaussian kernels, need not be. The method you linked to assumes a Gaussian distribution, if I'm understanding it correctly. You would have to modify it for a bimodal distribution, and would need to specify a parametric form for that bimodal distribution. – jcrudy Jul 02 '13 at 21:26
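Following up on the kernel density suggestion above: the linked gist is not reproduced here, but a minimal sketch of the idea, assuming SciPy's gaussian_kde together with PyMC2's @stochastic decorator, might look like the following (the burn-in length, initial value, and number of iterations are illustrative choices, not taken from the thread):

import numpy as np
import pymc
from scipy.stats import gaussian_kde

xtrue = 2
prior = pymc.rnormal(0, 1, size=10000)    # inaccurate initial guess, as in the question

for i in range(5):
    kde = gaussian_kde(prior)             # nonparametric estimate of the current belief about x

    @pymc.stochastic(dtype=float)
    def x(value=float(np.median(prior))):
        # log-probability of the prior is the log of the KDE density at `value`
        d = kde.evaluate([value])[0]
        return np.log(d) if d > 0 else -np.inf

    y = x*x                               # forward operator f(x) = x*x
    obs = pymc.Normal('obs', y, 0.1, value=xtrue*xtrue + pymc.rnormal(0, 1), observed=True)

    mcmc = pymc.MCMC([x, y, obs])
    mcmc.sample(10000, burn=1000)

    prior = mcmc.trace('x')[:]            # the KDE of these samples becomes the next prior

Because the KDE can represent both modes, the updated prior stays bimodal instead of collapsing to a single Gaussian centered at zero.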