2.) Why does integrating out the parameter space apply a principled penalty to the model score? --- Gu, chapter 1
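
A worked equation may make the penalty concrete (my notation, not necessarily Gu's): the evidence integrates the likelihood over the prior, and a Laplace approximation around the maximum-likelihood estimate shows a complexity penalty of k/2 * log(n) (k parameters, n observations) falling out of the integration; BIC is -2 times this approximation.

```latex
p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta
\quad\Longrightarrow\quad
\log p(D \mid M) \;\approx\; \log p(D \mid \hat{\theta}, M) \;-\; \frac{k}{2}\log n
```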

3.) What is a probabilistic justification for bootstrapping an estimate? Why/when does it match the posterior distribution of the estimate? Does this mean bootstrapping is getting the distribution of the estimate conditioned on the observed data?
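
A minimal simulation to play with the question, assuming a Normal model with a flat prior on the mean and the noise scale fixed at the sample standard deviation (all numbers here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=5.0, scale=2.0, size=200)   # hypothetical observed data
n = len(y)

# Nonparametric bootstrap distribution of the sample mean.
boot_means = np.array([rng.choice(y, size=n, replace=True).mean()
                       for _ in range(5000)])

# Bayesian posterior of the mean under a Normal likelihood, flat prior,
# and sigma fixed at the sample standard deviation (a simplifying assumption).
post_means = rng.normal(loc=y.mean(), scale=y.std(ddof=1) / np.sqrt(n), size=5000)

print("bootstrap: mean %.3f  sd %.3f" % (boot_means.mean(), boot_means.std()))
print("posterior: mean %.3f  sd %.3f" % (post_means.mean(), post_means.std()))
```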

4.) There are an infinite number of ways to create a variable, so what does the no-unobserved-confounders assumption (NUCA) really mean? What if I observe state but not city? What if I observe the last year but not the last month? Time and space can be cut into an uncountable number of covariates. What if I have grue and bleen instead of blue and green? Jerry Fodor's work on mapping the tokens of laws between the special sciences shows that this mapping is not one-to-one. Does "no unobserved confounders" then have to cover the union of all of those tokens?

5.) How do you assess whether you have enough common support to use inverse probability weighting?
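
One rough diagnostic, sketched under the assumption that a logistic regression on the observed covariate is the propensity model (the data below are simulated, and the min-max rule is only one of several overlap checks):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Hypothetical data: one confounder x drives treatment assignment.
n = 2000
x = rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-2.0 * x)))
ps = LogisticRegression().fit(x.reshape(-1, 1), t).predict_proba(x.reshape(-1, 1))[:, 1]

# Min-max overlap check: compare the range of estimated propensity scores
# in each arm, and count units outside the common range.
lo = max(ps[t == 1].min(), ps[t == 0].min())
hi = min(ps[t == 1].max(), ps[t == 0].max())
off_support = ((ps < lo) | (ps > hi)).mean()
print(f"common support: [{lo:.3f}, {hi:.3f}], share off support: {off_support:.1%}")

# Extreme inverse probability weights are another symptom of thin support.
w = np.where(t == 1, 1 / ps, 1 / (1 - ps))
print("max IPW weight:", w.max())
```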

6.) What bias does the confounding adjustment formula fix? What causal inference methods exist for other kinds of bias, such as selection bias?
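
For reference, the backdoor adjustment formula; it targets confounding bias when Z satisfies the backdoor criterion, and does not by itself address selection bias:

```latex
P\big(Y = y \mid do(X = x)\big) \;=\; \sum_{z} P\big(Y = y \mid X = x, Z = z\big)\, P(Z = z)
```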

7.) Does a power analysis need to be done differently for confounded data? How does common support play into a power analysis when the treatment effect is confounded?
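
One way to get a handle on this is simulation-based power analysis: simulate the assumed confounded data-generating process, apply the planned adjusted analysis, and count rejections. A minimal sketch with an entirely hypothetical single-confounder setup:

```python
import numpy as np
import statsmodels.api as sm

def power_with_confounding(n, effect=0.3, conf_strength=1.0, sims=500, alpha=0.05, seed=0):
    """Simulation-based power for a regression-adjusted treatment effect
    under a single confounder x (a hypothetical setup, not a general recipe)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(sims):
        x = rng.normal(size=n)
        t = rng.binomial(1, 1 / (1 + np.exp(-conf_strength * x)))   # confounded assignment
        y = effect * t + 1.0 * x + rng.normal(size=n)
        X = sm.add_constant(np.column_stack([t, x]))
        p = sm.OLS(y, X).fit().pvalues[1]    # p-value on the treatment coefficient
        hits += p < alpha
    return hits / sims

print(power_with_confounding(n=300))
```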

8.) Is there a way to generalize grid search to estimate a posterior, instead of using MCMC? What assumptions would you have to make? Is there a way to capture those assumptions?
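
Grid approximation is straightforward in low dimensions; the key assumptions are a bounded (or truncated) parameter space, a grid fine enough to capture the posterior's shape, and dimensionality small enough that the grid is affordable. A minimal sketch for a Bernoulli rate:

```python
import numpy as np
from scipy import stats

# Hypothetical data: 7 successes in 20 Bernoulli trials.
k, n = 7, 20

# Grid approximation: evaluate prior * likelihood on a grid and renormalize.
theta = np.linspace(0.001, 0.999, 1000)
prior = np.ones_like(theta)                  # flat prior over (0, 1)
like = stats.binom.pmf(k, n, theta)
post = prior * like
post /= np.trapz(post, theta)                # normalize to a density

print("posterior mean:", np.trapz(theta * post, theta))
```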

10.) Is the Bunzl argument for frequentism that the prior distribution is conditioned on prior data? And if so, what is the weight of that prior data? Why do we treat it like a single observation? If I have a study that says theta=2 based on 1000 samples, shouldn't my prior hold the weight of 1000 samples?
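
One concrete way to frame the weighting question, via a standard conjugate Normal example (my notation, not Bunzl's; known variance assumed): if the prior mean theta_0 comes from a previous study of n_0 observations and is given precision n_0 / sigma^2, the posterior mean weights old and new data by their sample sizes, so a prior built from 1000 samples really does carry the weight of 1000 samples.

```latex
\theta \mid \text{old data} \sim \mathcal{N}\!\left(\theta_0,\ \frac{\sigma^2}{n_0}\right)
\quad\Longrightarrow\quad
E[\theta \mid \text{new data}] \;=\; \frac{n_0\,\theta_0 + n\,\bar{y}}{n_0 + n}
```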

11.) Can a DAG really represent the causal structure of elements measured within a time window? What assumptions are being made about the measurement and the time point it represents? Just because I measure something now doesn't mean that value came about at the time of measurement. I wonder if there is a counterexample where P(M | data) is a collider even though the actual generating model is not a collider.

12.) Can we use sums/averages/medians or other summary statistics in DAGs?

13.) Can you learn the missingness graph by Occam's razor?

16.) Keep track of the evidence, SUM(e^-BIC), and somehow use it as the MCMC run develops.
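
For reference, the usual link between BIC and the evidence has a factor of 1/2 in the exponent; summing it over candidate models, weighted by model priors, approximates the total evidence:

```latex
p(D \mid M_k) \;\approx\; e^{-\mathrm{BIC}_k / 2},
\qquad
p(D) \;\approx\; \sum_k e^{-\mathrm{BIC}_k / 2}\, p(M_k)
```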

17.) What is the dependence between two counterfactual random variables?

18.) Is there an identifiable criterion to know if a probability marginalization is analytically tractable? Is there an identifiable criterion to know if there is an obvious envelope distribution for a distribution of interest, in order to apply a rejection sampler?
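
A minimal rejection-sampler sketch to anchor the second question: the envelope here is just a scaled uniform, and finding the bound M (or a tighter envelope) is exactly the step whose "obviousness" the question asks about. The target density is a hypothetical Beta(2, 5):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Target density and a crude envelope: uniform proposal scaled by M >= max_x f(x).
target = stats.beta(2, 5).pdf
M = target(np.linspace(0, 1, 10001)).max()    # numeric bound on the density

def rejection_sample(n):
    out = []
    while len(out) < n:
        x = rng.uniform(0, 1)                 # draw from the envelope
        u = rng.uniform(0, M)                 # uniform height under the envelope
        if u < target(x):                     # accept if it falls under the target
            out.append(x)
    return np.array(out)

samples = rejection_sample(5000)
print("sample mean %.3f vs true mean %.3f" % (samples.mean(), 2 / (2 + 5)))
```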

19.) What exactly is a Dedekind cut? Is the continuum hypothesis a matter of memory in sets? What if the set of all subsets of N is reduced by some form of memory adjustment (only the last n elements)?
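
For the first part, the standard definition:

```latex
\text{A Dedekind cut is a partition } (A, B) \text{ of } \mathbb{Q} \text{ with } A, B \neq \emptyset,\;
a < b \text{ for every } a \in A,\ b \in B,\; \text{and } A \text{ having no greatest element;}
\text{ each such cut determines exactly one real number.}
```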