4

I was porting the example of a Simple Bayesian Network via Monte Carlo Markov Chain from PyMC2 to PyMC3 and it works.
The result can be found in the following gist on GitHub in the file pymc3_rain_sprinkler_grass_simple_bayesian_network.py.

I wanted to extend the original example by providing evidence, e.g. that the grass is wet and then let PyMC3 give me the answer for questions like "given grass is wet, what is the probability that it has rained?".

It seems that the resulting trace is "constant", e.g. there is no element of randomness in it any more. Have a look at pymc3_rain_sprinkler_grass_simple_bayesian_network_with_evidence.py in the gist and execute the df.drop_duplicates() to see what I mean.

What am I doing wrong?

Sam Denty
  • 3,693
  • 3
  • 30
  • 43
cs224
  • 328
  • 2
  • 12
  • I've updated the gist with the same code in `PyMC2`. The `PyMC2` version does not suffer from the same problem as the PyMC3 version. Any ideas? – cs224 Feb 28 '17 at 14:54
  • In [pymc3-multiple-observed-values](http://stackoverflow.com/questions/24242660/pymc3-multiple-observed-values) I've found the following statement: "There is nothing fundamentally wrong with your approach, except for the pitfalls of any Bayesian MCMC analysis: (1) non-convergence, (2) the priors, (3) the model." I think my case here is a similar case and I will continue to tune the solution. If I provide as evidence that the grass is not wet rather than that the grass is wet the `PyMC3` version works, too. I'll continue to update my findings here and in the gist. – cs224 Mar 01 '17 at 08:43

1 Answers1

4

I managed to solve my problem. The main point was to set testval to "true" rather than "false". It improved the situation to change the step method from Metropolis to BinaryGibbsMetropolis.

For reference here is the complete solution. I also updated the gist.

import numpy as np
import pandas as pd
import pymc3 as pm

niter = 10000  # 10000
tune = 5000  # 5000

model = pm.Model()

with model:
    tv = [1]
    rain = pm.Bernoulli('rain', 0.2, shape=1, testval=tv)
    sprinkler_p = pm.Deterministic('sprinkler_p', pm.math.switch(rain, 0.01, 0.40))
    sprinkler = pm.Bernoulli('sprinkler', sprinkler_p, shape=1, testval=tv)
    grass_wet_p = pm.Deterministic('grass_wet_p', pm.math.switch(rain, pm.math.switch(sprinkler, 0.99, 0.80), pm.math.switch(sprinkler, 0.90, 0.0)))
    grass_wet = pm.Bernoulli('grass_wet', grass_wet_p, observed=np.array([1]), shape=1)

    trace = pm.sample(20000, step=[pm.BinaryGibbsMetropolis([rain, sprinkler])], tune=tune, random_seed=124)

# pm.traceplot(trace)

dictionary = {
              'Rain': [1 if ii[0] else 0 for ii in trace['rain'].tolist() ],
              'Sprinkler': [1 if ii[0] else 0 for ii in trace['sprinkler'].tolist() ],
              'Sprinkler Probability': [ii[0] for ii in trace['sprinkler_p'].tolist()],
              'Grass Wet Probability': [ii[0] for ii in trace['grass_wet_p'].tolist()],
              }
df = pd.DataFrame(dictionary)

p_rain = df[(df['Rain'] == 1)].shape[0] / df.shape[0]
print(p_rain)

p_sprinkler = df[(df['Sprinkler'] == 1)].shape[0] / df.shape[0]
print(p_sprinkler)
cs224
  • 328
  • 2
  • 12
  • I've updated the gist with a pyjags version. If I don't specify the correct starting conditions for JAGS it gives me an error message: `console.JagsError: Error in node grass_wet; Observed node inconsistent with unobserved parents at initialization. Try setting appropriate initial values.` It would be nice if PyMC3 would give similar warnings. – cs224 Mar 02 '17 at 03:31