I am estimating a model using the pyMC3 library in python. In my "real" model, there are four parameter arrays, two of which have over 170,000 parameters in them. Summarising this array of parameters is too computationally intensive on my computer. I have been trying to figure out if the summary function in arviz will allow me to only summarise one (or a small number) of parameters in the array. Below is a reprex where the same problem is present, though the model is a lot simpler. In the linear regression model below, the parameter array b
has three parameters in it b[0]
, b[1]
, b[2]
. I would like to know how to get the summary for just b[0]
and b[1]
or alternatively for just a single parameter, e.g., b[0]
.
import pandas as pd
import pymc3 as pm
import arviz as az
d = pd.read_csv("https://quantoid.net/files/mtcars.csv")
mpg = d['mpg'].values
hp = d['hp'].values
weight = d['wt'].values
with pm.Model() as model:
b = pm.Normal("b", mu=0, sigma=10, shape=3)
sig = pm.HalfCauchy("sig", beta=2)
mu = pm.Deterministic('mu', b[0] + b[1]*hp + b[2]*weight)
like = pm.Normal('like', mu=mu, sigma=sig, observed=mpg)
fit = pm.fit(10000, method='advi')
samp = fit.sample(1500)
with model:
smry = az.summary(samp, var_names = ["b"])
It looked like the coords
argument to the summary()
function would do it, but after googling around and finding a few examples, like the one here with plot_posterior()
instead of summary()
, I was unable to get something to work. In particular, I tried the following in the hopes that it would return the summary for b[0]
and b[1]
.
with model:
smry = az.summary(samp, var_names = ["b"], coords={"b_dim_0": range(1)})
or this to return the summary of b[0]
:
with model:
smry = az.summary(samp, var_names = ["b"], coords={"b_dim_0": [0]})
I suspect I am missing something simple (I'm an R user who dabbles occasionally with Python). Any help is greatly appreciated.
(BTW, I am using Python 3.8.0, pyMC3 3.9.3, arviz 0.10.0)