I would appreciate any help to do this:

- for each value of `P`, fit the model for each column of `weights`;
- repeat step 1 for all observations in the dataset to get `p * w` submodels, where `p` is the number of values of `P` and `w` is the number of columns of `weights`;
- pool the results by combining the posterior samples of the submodels (a rough R sketch of this loop follows below).
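For concreteness, here is a minimal sketch of what I have in mind, assuming the model is compiled with `rstan`, the full long-format data frame is called `dat`, and `build_standata()` is a hypothetical helper that assembles the Stan data list for one (imputation, weight-column) pair:

```r
library(rstan)

mod <- stan_model("model.stan")

fits <- list()
for (p in unique(dat$P)) {            # loop over imputations
  dat_p <- dat[dat$P == p, ]
  for (wc in paste0("w_", 1:10)) {    # loop over weight columns
    # build_standata() is a hypothetical helper that assembles the
    # Stan data list for imputation p using only weight column wc
    sdata <- build_standata(dat_p, weight_col = wc)
    fits[[paste(p, wc, sep = ".")]] <- sampling(mod, data = sdata)
  }
}

# pool: stack the posterior draws of all p * w submodels
draws <- do.call(rbind, lapply(fits, as.matrix))
```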
A similar `brms` implementation is available here, also discussed here. But that only allows incorporating the uncertainty coming from the multiple imputation `P`; I want to account for the uncertainty coming from the estimation of the `weights` as well.
@paul.buerkner says there that we "could also extract those data sets and then pass them to the actual model fitting function as a list of data frames". Using this approach, the solution would probably be to pass my individual `p * w` data frames, each with `p = 1` and `w = 1`. Still, it is not clear to me how, in practice, the list of these data frames could be passed to the model fitting function.
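If I understand that suggestion correctly, it might look something like the sketch below. This is only a guess on my part: `df_list` stands for my `p * w` single-imputation, single-weight data frames (built further below), each with the selected weight column renamed to a common name `w`, and the formula is a placeholder for the actual model:

```r
library(brms)

fit <- brm_multiple(
  ncases | trials(nn) + weights(w) ~ A * B,  # placeholder formula
  data    = df_list,    # list of p * w data frames
  family  = binomial(),
  combine = TRUE        # pool the posterior draws across submodels
)
```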
In my case, the multiply imputed values of the predictors are in long format, with each imputation stage represented by a value of the variable `P`, and the set of `weights` estimated at each `P` is in wide format, represented by the variables `w_1` to `w_10`.
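Given this layout, the list of data frames needed above could presumably be built by splitting on `P` and then copying one weight column at a time into a common column `w` (again a sketch, assuming the full data frame is called `dat`):

```r
df_list <- list()
for (dat_p in split(dat, dat$P)) {     # one data frame per imputation
  for (wc in paste0("w_", 1:10)) {     # one per weight column
    d <- dat_p
    d$w <- d[[wc]]                     # common weight column name
    df_list[[length(df_list) + 1]] <- d
  }
}
```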
There is also the `stan` approach discussed here. This requires calling "stan by parallel and it returns a list of stanfit objects," and then using `sflist2stanfit` to create one stanfit object from the list. Here the problem is that I would need to separate my data frame into `p * w = 50` datasets, or more than 100 (if I increase the number of imputations `P` to more than 10, as generally recommended), each with `p = 1` and `w = 1`.
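As far as I can tell, that approach would look roughly like this, assuming `sdata_list` already holds the 50+ Stan data lists, one per (imputation, weight-column) pair:

```r
library(rstan)
library(parallel)

mod <- stan_model("model.stan")

# fit each dataset in parallel, one chain per submodel
sflist <- mclapply(sdata_list, function(sdata) {
  sampling(mod, data = sdata, chains = 1)
}, mc.cores = detectCores())

# merge the list of stanfit objects into a single stanfit
fit <- sflist2stanfit(sflist)
```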
Following @BobCarpenter's recommendation, below I present my attempt to parallelize the likelihood. It seems this approach could allow me to account for one of the sources of uncertainty separately, but not both jointly, as intended. Here I am not certain how to specify the `transformed parameters` block and the likelihood so as to account for the uncertainty from `P` too.
Any help to improve and/or correct my current attempt would be much appreciated, as would any input on implementing any of the other approaches discussed above. I apologise if you find any basic mistakes in my code. While I am certain about what I want to do and why, the Stan implementation of my ideas is still challenging; I am still learning.
// model
data {
  int<lower=1> N;          // number of observations
  int<lower=1> P;          // number of imputations (currently unused below)
  int<lower=0> ncases[N];  // number of cases
  int A[N];                // predictor A
  int B[N];                // predictor B
  int<lower=0> nn[N];      // number of trials
  int id[N];               // observation id (currently unused below)
  real w_1[N];             // weights estimated at each imputation stage
  real w_2[N];
  real w_3[N];
  real w_4[N];
  real w_5[N];
  real w_6[N];
  real w_7[N];
  real w_8[N];
  real w_9[N];
  real w_10[N];
}
parameters {
  real beta_0;
  real beta_1;
  real beta_2;
  real beta_3;
}
transformed parameters {
  vector[N] pp_hat;        // probability of a case
  vector[N] odds;
  for (i in 1:N) {
    odds[i] = exp(beta_0) * (1 + beta_1 * A[i] + beta_2 * B[i] + beta_3 * A[i] * B[i]);
    pp_hat[i] = odds[i] / (1 + odds[i]);
  }
}
model {
  beta_0 ~ normal(0, 5);
  beta_1 ~ normal(0, 5);
  beta_2 ~ normal(0, 5);
  beta_3 ~ normal(0, 5);
  // weighted likelihood: each observation contributes once per weight column
  for (i in 1:N) {
    target += w_1[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_2[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_3[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_4[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_5[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_6[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_7[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_8[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_9[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
    target += w_10[i] * binomial_lpmf(ncases[i] | nn[i], pp_hat[i]);
  }
}
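For completeness, this is how I currently fit the model above from R (assuming a data frame `d` whose columns already have the names and types in the data block):

```r
mod <- stan_model("model.stan")  # the model above saved as model.stan
sdata <- c(
  list(N = nrow(d), P = 10),     # P = 10 imputations (not yet used in the likelihood)
  as.list(d[c("ncases", "A", "B", "nn", "id", paste0("w_", 1:10))])
)
fit <- sampling(mod, data = sdata)
```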
Thanks in advance.