4

I'm playing around with PyMC3, trying to fit a modified version of the mining disaster switchpoint model in the PyMC3 documentation. Suppose you had two coal-mines (mine1 and mine2), each with similar disaster counts for the same range of years.

However, mine1 was 5 years late in implementing the change in safety procedures that lowered the disaster count:

import numpy as np
import matplotlib.pyplot as plt

mine1=np.array([0,4,5,4,0,1,4,3,4,0,6,3,3,4,0,2,6,3,3,5,4,5,3,1,4,4,1,5,5,3,4,2,5,2,2,3,
       4,2,1,3,0,2,1,1,1,1,3,0,0,1,0,1,1,0,0,3,1,0,3,2,2,0,1,1,1,0,1,0,1,0,0,0,
       2,1,0,0,0,1,1,0,2,3,3,1,0,2,1,1,1,1,2,4,2,0,0,1,4,0,0,0,1]);
mine2=np.array([3,3,4,0,2,6,2,3,4,3,7,4,1,5,4,1,5,5,3,4,1,6,2,2,2,4,4,0,4,0,3,3,1,0,3,2,
       0,0,1,0,1,1,0,0,3,0,0,3,1,1,0,1,1,1,0,0,0,0,1,1,1,3,1,0,1,0,0,2,0,1,2,2,
       0,0,3,3,0,2,3,2,4,2,0,0,1,3,0,0,1,2,0,1,1,0,0,2,0,2,0,0,0]);

both_mines = mine1+mine2;

years = np.arange(1849,1950);

fig, axs = plt.subplots(2);
axs[0].plot(years, both_mines,'ko');
axs[0].legend(['mines_summed'],loc='upper right');
axs[0].set_ylabel('disaster count')
axs[1].plot(years, mine1,'ro');
axs[1].plot(years, mine2,'bo');
axs[1].legend(['mine1','mine2'],loc='upper right');
axs[1].set_ylabel('disaster count')

plots, summed vs separate mining disasters for 2 mines

I'm interested in testing whether a better model fit results from summing the yearly counts and fitting a single switchpoint to this summed count time series, or fitting a separate model to the two mines.

Model 1 - Single model for sum across mines

import pymc3 as pm    
with pm.Model() as model1:
    switchpoint = pm.DiscreteUniform('switchpoint', lower=years.min(), upper=years.max());
    early_rate = pm.Exponential('early_rate', 1)
    late_rate = pm.Exponential('late_rate', 1)
    rate = pm.math.switch(switchpoint >= years, early_rate, late_rate)
    disasters_both_mines = pm.Poisson('disasters_both_mines', rate, observed=both_mines)
    trace1 = pm.sample(10000,tune=2000);
    pm.traceplot(trace1)

Yields fits very similar to the documentation example. Here is the trace plot:

posteriors - model 1

When it comes to fitting the model that keeps the mines separate, I've tried two approaches that are both sub-optimal for different reasons. The first is to fit the two data likelihoods, separately for each mine.

Model 2a - Separate mines, two likelihoods

with pm.Model() as model2a:
    switchpoint_mine1 = pm.DiscreteUniform('switchpoint_mine1', lower=years.min(), upper=years.max());
    switchpoint_mine2 = pm.DiscreteUniform('switchpoint_mine2', lower=years.min(), upper=years.max());
    early_rate_sep = pm.Exponential('early_rate2', 1,shape=2)
    late_rate_sep = pm.Exponential('late_rate2', 1,shape=2)
    
    rate_mine1 = pm.math.switch(switchpoint_mine1>=years, early_rate_sep[0], late_rate_sep[0]);
    rate_mine2 = pm.math.switch(switchpoint_mine2>=years, early_rate_sep[1], late_rate_sep[1]);
    
    disasters_mine1 = pm.Poisson('disasters_mine1', rate_mine1, observed=mine1);
    disasters_mine2 = pm.Poisson('disasters_mine2', rate_mine2, observed=mine2);
    trace2a = pm.sample(10000,tune=2000);
    pm.traceplot(trace2a);

Nice looking fit

The fit looks nice and seems sensitive to the difference in switchpoint. However, I can't compute a WAIC or LOO value, meaning I can't commpare the fit to Model 1. I'm guessing since there's two sets of observations?

e.g.

pm.waic(trace2a)
Traceback (most recent call last):

  File "<ipython-input-270-122a6fb53049>", line 1, in <module>
    pm.waic(trace2a)

  File "<home dir>/opt/anaconda3/lib/python3.7/site-packages/pymc3/stats/__init__.py", line 24, in wrapped
    return func(*args, **kwargs)

  File "<home dir>/opt/anaconda3/lib/python3.7/site-packages/arviz/stats/stats.py", line 1164, in waic
    raise TypeError("Data must include log_likelihood in sample_stats")

TypeError: Data must include log_likelihood in sample_stats

The second idea was to use a similar approach to the Hierarchical Linear Regression example and use a combination of concatenation, indexing and the shape output on the priors, to fit a vector of each parameter and a single data likelihood.

Model 2b - Separately indexed mines, single likelihood function

mine1_ind = np.ones(101,dtype=int)-1
mine2_ind = np.ones(101,dtype=int)*1
mine_ix = np.concatenate((mine1_ind,mine2_ind), axis=0);
concat_mines = np.concatenate((mine1,mine2), axis=0);
concat_years = np.transpose(np.concatenate((years,years), axis=0));

with pm.Model() as model2b:
    switchpoint_mine1and2 = pm.DiscreteUniform('switchpoint_mine1and2', lower=years.min(), upper=years.max(),shape=2);
    early_rate_mine1and2 = pm.Exponential('early_rate_mine1and2', 1,shape=2);
    late_rate_mine1and2 = pm.Exponential('late_rate_mine1and2', 1,shape=2);   
    
    rate_mine1and2 = pm.math.switch(switchpoint_mine1and2[mine_ix]>=concat_years[mine_ix], early_rate_mine1and2[mine_ix], late_rate_mine1and2[mine_ix]);       
    
    disasters_mine1and2 = pm.Poisson('disasters_mine1and2', rate_mine1and2, observed=concat_mines);
    trace2b = pm.sample(10000,tune=2000);

This model fits, and allows a WAIC to be computed. However looking at the posteriors, it couldn't fit the switchpoints.

posterior

So to summarize, is there a way to either fit Model2a in a manner that allows a WAIC to be computed, or is there any change that could be made to Model2b that makes it fit better posteriors?

Many thanks for any help.

OriolAbril
  • 7,315
  • 4
  • 29
  • 40
Sham Doran
  • 41
  • 2

1 Answers1

2

I don't have a definite answer, but here is some advise that should help you get things working.

First begin by updating ArviZ to its latest version, from the error message it looks like your version is older than the first version with multiple likelihood support. Even though it looks like you are using PyMC3 functions, PyMC3 delegates its plotting and statistics to ArviZ.

Then, I would recommend taking a look at ArviZ's educational resources. There currently is an open PR to add guidance on this kind of issues. Here is a link to the notebook. I think it is in an advanced enough state to be useful. If it weren't, there are other questions on here on SO or in PyMC3 discourse 1, 2. These should cover some extra examples.

Finally, here are the key ideas from these detailed answers. The first key point is that there is no one right answer, depending on the question you want to ask, waic/loo can be calculated in different ways. Second key idea is that ArviZ let's you choose how to calculate waic/loo to adapt to all possible questions, so in multiple likelihoods cases, post processing of the data in the log_likelihoods group is needed.

Dharman
  • 30,962
  • 25
  • 85
  • 135
OriolAbril
  • 7,315
  • 4
  • 29
  • 40