Mixed Pareto and Normal Stan model not working

Question

I'm trying to learn Stan via rstan (since I'm familiar with R). I've tried running a simple mixed Pareto and Normal model. It compiles fine (as far as I can tell), but it fails to sample, giving me the error:

"Initialization between (-2, 2) failed after 100 attempts. Try specifying initial values, reducing ranges of constrained values, or reparameterizing the model.

error occurred during calling the sampler; sampling not done"

Suffice to say I've tried various ways to parameterize things, and tried setting initial values, but all to no avail.

My R+rstan code is below:

library(rstan)
rpareto = function(n, location, shape){location/runif(n)^(1/shape)}
sdvec=runif(1e3,0.1,1)
HMFtest=list(x=rpareto(1e3,10,2)+rnorm(1e3,0,sdvec), sdev=sdvec, N=1e3)

HMF.stan <- "
data {
  int<lower=0> N;
  real x[N];
  real sdev[N];
}
parameters {
  real<lower=0,upper=20> y_min;
  real<lower=0,upper=4> alpha;
  real xtrue[N];
}
model {
  y_min ~ lognormal(1, 1);
  alpha ~ lognormal(1, 1);
  xtrue ~ pareto(y_min, alpha);
  for(i in 1:N){
    x[i] ~ normal(xtrue[i], sdev[i]);
  }
}
"

stan.test <- stan(model_code=HMF.stan, data=HMFtest, pars=c('y_min','alpha'), chains=1, iter=30000, warmup=10000)

This example works fine with JAGS (hence I've tagged JAGS too) and I can post that code is it is helpful.

Incidentally, if I change the Pareto distribution to an additional normal distribution it runs fine (but gives me a nonsense answer, of course).

Any suggestions as to what I'm doing wrong would be much appreciated! I fear somehow I'm still thinking JAGS not Stan, but I couldn't find any examples of people fitting Pareto models with Stan, so it has been hard for me to cross validate my approach.

Example of experimental model code (super-restrictive +ve priors close to the correct solution) that still doesn't sample: HMF.stan <- " data { int N; real x[N]; real sdev[N]; } parameters { real y_min; real alpha; real xtrue[N]; } model { y_min ~ normal(10, 5) T[5,15]; alpha ~ normal(2, 5) T[1,3]; xtrue ~ normal(y_min, alpha); for(i in 1:N){ x[i] ~ normal(xtrue[i], sdev[i]); } } " — ASGR, Feb 07 '15 at 15:08

j13r · Answer 1 · 2015-02-07T20:21:04.903

The error message means that all the random starting points tried yielded a likelihood of zero.

I was able to reproduce your problem in Stan, with model

data {
  int<lower=0> N;
  real x[N];
  real sdev[N];
}
parameters {
  real<lower=0,upper=20> y_min;
  real<lower=0,upper=4> alpha;
  real xtrue[N];
}
model {
  y_min ~ lognormal(1, 1);
  alpha ~ lognormal(1, 1);
  print("y_min=", y_min, " alpha=", alpha);
  xtrue ~ pareto(y_min, alpha);
  print("xtrue: ", xtrue);
  x ~ normal(xtrue, sdev);
  print("x=", x);
}

and data

N <- 6
sdev <- c(0.3339302,0.2936877,0.8540434,0.2399283,0.1014759,0.3717446)
x <- c(12.640112,10.502748,11.015629,29.382395,61.180509,12.772482)

Compiling and running with Stan 2.0.1 (quite old now) I get output like the following:

y_min=4.49609:0 alpha=2.54906:0
xtrue: [0.992331:0,0.303142:0,0.180334:0,1.96009:0,0.903113:0,1.75711:0]
x=[12.6401,10.5027,11.0156,29.3824,61.1805,12.7725]
y_min=17.0143:0 alpha=1.67509:0
xtrue: [-1.40618:0,1.82026:0,1.67344:0,-0.973618:0,0.746502:0,1.93469:0]
x=[12.6401,10.5027,11.0156,29.3824,61.1805,12.7725]

So while reasonable parameters are being chosen for y_min and alpha, the pareto-generated values are also below y_min. In the manual, the probability distribution function does not contain the truncation either. I think that is the problem (replacing the pareto with a normal distribution runs fine). ~~I recommend opening a bug with Stan on github, stating that x ~ pareto(y_min, alpha) generates values below y_min.~~

Code works with the latest Stan version. Please upgrade first, the bug seems to have been fixed some time ago.

As far as I can tell I'm using the latest R (3.1.2) / rstan (2.6.0) / stan. I'm using Mountain Lion (10.8.0) though. What's your setup? — ASGR, Feb 08 '15 at 02:27
I just fetched the latest version of Stan and CmdStan from git. — j13r, Feb 08 '15 at 16:35

score 2 · Answer 2 · answered Feb 08 '15 at 03:57

The fundamental problem is the mismatch between the support declared on the parameters parameters { real<lower=0,upper=20> y_min; real<lower=0,upper=4> alpha; real xtrue[N]; } and the sample space of the priors model { y_min ~ lognormal(1, 1); alpha ~ lognormal(1, 1); xtrue ~ pareto(y_min, alpha); ...

y_min is constrained to the (0,20) interval but a lognormal prior spreads a unit of mass over the entire positive real line
alpha is constrained to the (0,20) interval but a lognormal prior spreads a unit of mass over the entire positive real line
Worst of all, each element of xtrue is unconstrained --- meaning it is allowed to be anything over the entire real line --- but a pareto prior spreads a unit of mass over the interval (y_min,Infinity)

The easiest thing to do would be to declare the parameters as parameters { real<lower=0> y_min; real<lower=0> alpha; real<lower=y_min> xtrue[N]; } In principle, you could keep the upper bounds on y_min and alpha and specify some priors that integrate to 1 over the declared support. A crude way to do that is to truncate (which will divide the lognormal PDF by the amount of untruncated mass) the lognormal priors like model { y_min ~ lognormal(1, 1) T[,20]; alpha ~ lognormal(1, 1) T[,4]; Perhaps a uniform or a four parameter beta distribution would be more appropriate than a truncated lognormal.

Finally, although it is not logically wrong for(i in 1:N){ x[i] ~ normal(xtrue[i], sdev[i]); } is much worse computationally than the logically equivalent statement x ~ normal(xtrue, sdev);

If I apply pretty restrictive ranges: parameters { real y_min; real alpha; real xtrue[N]; } model { y_min ~ lognormal(1, 1) T[9,11]; alpha ~ lognormal(1, 1) T[1, 3]; xtrue ~ pareto(y_min, alpha); x ~ normal(xtrue, sdev); } And I print y_min, alpha and xtrue I can see that it errors when: y_min=11 alpha=1 xtrue=[inf, 8.02369e+219, 1.23259e+48, inf, 2.94639e+23, inf, 11, 5.25037e+21, 11, 11] i.e. it looks like it fails when it samples right at the numeric limit. This doesn't happen with other distributions- the normal is okay. — ASGR, Feb 08 '15 at 07:12

score 0 · Answer 3 · answered Feb 09 '15 at 06:17

All fixed folks. It turns out there were some deep problems with my mixture of R, stan, and the c++ compiler.

Not wanting to faff about with fixing things by hand I took the sledgehammer approach and just upgraded to Yosemite, and then installed the key components from scratch. This appears to fix everything, and my chains are converging very nicely now.

This was a weird one, since the compiler could build rstan and compile/sample many of the rstan examples. I don't know why the pareto caused these issues, but they are certainly fixed now.

Mixed Pareto and Normal Stan model not working

3 Answers3