
I need posterior samples of the log likelihood terms to run PSIS, which expects:

log_lik : ndarray
    Array of size n x m containing n posterior samples of the log likelihood
    terms :math:`\log p(y_i|\theta^s)`.
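For the eight-schools example below that means one row per posterior draw and one column per observation (m = 8). Purely as an illustration of the expected shape:

import numpy as np

# Illustrative only: n = 2000 posterior draws, m = 8 observations;
# entry [s, i] would hold log p(y_i | theta^s).
log_lik = np.zeros((2000, 8))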

Here is a small example (after `pip install pystan`):

import pystan
schools_code = """
data {
    int<lower=0> J; // number of schools
    real y[J]; // estimated treatment effects
    real<lower=0> sigma[J]; // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
        theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
"""

schools_dat = {'J': 8,
               'y': [28,  8, -3,  7, -1,  1, 18, 12],
               'sigma': [15, 10, 16, 11,  9, 11, 10, 18]}

sm = pystan.StanModel(model_code=schools_code)
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)

How can I get the posterior samples of the log likelihood from the PyStan fit?

hhh

2 Answers


I believe the correct way to calculate the log likelihood in this case is the following:

generated quantities {
    vector[J] log_lik;
    for (i in 1:J)
        log_lik[i] = normal_lpdf(y[i] | theta[i], sigma[i]);
}
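With this block added, PyStan exposes `log_lik` like any other quantity, so a quick sanity check of the shape is possible (a sketch; with iter=1000 and chains=4, the default warmup of 500 leaves 2000 post-warmup draws):

log_lik = fit.extract()['log_lik']  # permuted post-warmup draws from all chains
print(log_lik.shape)                # (2000, 8): n draws x J observations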

After that you can run PSIS-LOO as follows:

loo, loos, ks = psisloo(fit['log_lik'])
print('PSIS-LOO value: {:.2f}'.format(loo))
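Besides the total, `psisloo` also returns the pointwise contributions (`loos`) and the Pareto shape estimates (`ks`). A common follow-up check, using the usual 0.7 threshold from the PSIS-LOO literature, is:

import numpy as np

# k values above roughly 0.7 flag observations for which the
# PSIS-LOO estimate is considered unreliable.
print('observations with k > 0.7: {}'.format(np.sum(ks > 0.7)))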

The full code then becomes:

import pystan
from psis import psisloo
schools_code = """
data {
    int<lower=0> J;            // number of schools
    real y[J];                 // estimated treatment effects
    real<lower=0> sigma[J];    // s.e. of effect estimates
}
parameters {
    real mu;
    real<lower=0> tau;
    real eta[J];
}
transformed parameters {
    real theta[J];
    for (j in 1:J)
       theta[j] = mu + tau * eta[j];
}
model {
    eta ~ normal(0, 1);
    y ~ normal(theta, sigma);
}
generated quantities {
    vector[J] log_lik;
    for (i in 1:J)
        log_lik[i] = normal_lpdf(y[i] | theta[i], sigma[i]);
}
"""

schools_dat = {'J': 8,
               'y': [28,  8, -3,  7, -1,  1, 18, 12],
               'sigma': [15, 10, 16, 11,  9, 11, 10, 18]}

sm = pystan.StanModel(model_code=schools_code) 
fit = sm.sampling(data=schools_dat, iter=1000, chains=4)
loo, loos, ks = psisloo(fit['log_lik'])
print('PSIS-LOO value: {:.2f}'.format(loo))
Pörripeikko
  • There is also a convenience function in the R package loo relating to your example here: https://mc-stan.org/loo/reference/extract_log_lik.html – user14241 Mar 24 '21 at 03:33

You can get the posterior samples of the log likelihood by doing: `logp = fit.extract()['lp__']`

Junpeng Lao
    This is wrong. `lp__` is the log-*posterior* not the log-*likelihood*. Therefore `lp__` also includes contributions from your prior densities and any Jacobian adjustments necessary to allow sampling to happen on the unconstrained scale. – jwalton Nov 11 '19 at 15:34
  • This is wrong. The PSIS-LOO (and WAIC) requires the log-likelihood, not the log-posterior. – Nadav Sep 08 '20 at 17:48
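To make the distinction in these comments concrete, here is a sketch comparing the two quantities (assuming the fit includes the `generated quantities` block from the answer above):

samples = fit.extract()

lp = samples['lp__']          # shape (n,): log-posterior per draw, up to a constant,
                              # including priors and any Jacobian adjustments
log_lik = samples['log_lik']  # shape (n, J): one log-likelihood term per observation

# PSIS-LOO needs the (n, J) matrix of pointwise log-likelihood terms, not lp__.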