In statistics, a confidence interval is a measure of the precision of an estimate of an underlying parameter. In principle, if the estimation is repeated on many independent samples of data, the true parameter value should fall within the associated confidence intervals a set proportion of the time. This proportion is known as the coverage probability, and is most commonly set to 95%.
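A minimal sketch of computing such an interval, assuming Python with NumPy and SciPy (the data and parameter values are hypothetical, chosen only for illustration):

```python
# Minimal sketch: a 95% t-based confidence interval for a population mean,
# using NumPy and SciPy (assumed available; data are simulated for illustration).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=30)  # hypothetical sample

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
# 95% interval from the t distribution with n-1 degrees of freedom
lo, hi = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```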
When estimating a parameter, or a function of parameters c(θ), from observations X of random variables whose distribution depends on θ, a confidence interval (when c(θ) is scalar) or confidence region (when c(θ) is a vector) is a set C = C(X) such that P(c(θ) ∈ C) = 1−α. To note:
- The confidence interval is a function of the data, X, so is itself random.
- The probability statement P(c(θ) ∈ C) = 1−α refers to the randomness in X, which determines C. Since confidence intervals are a frequentist notion, one should not read the probability as applying to the unobserved parameter c(θ), which, to a frequentist, is not random.
- Often one can only compute approximate confidence intervals, which attain the nominal coverage only asymptotically as the sample size grows (the simulation sketch below illustrates this).
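A sketch of checking coverage by simulation, again assuming NumPy and SciPy; the true mean is known here only because the data are generated, and the skewed (exponential) data make the t interval approximate rather than exact:

```python
# Minimal sketch: empirical coverage of a nominal 95% interval, estimated
# by repeatedly drawing samples and checking whether the interval contains
# the known true mean.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
true_mean, n, n_reps = 5.0, 50, 2000
covered = 0
for _ in range(n_reps):
    x = rng.exponential(scale=true_mean, size=n)  # skewed data: interval is approximate
    lo, hi = stats.t.interval(0.95, df=n - 1, loc=x.mean(), scale=stats.sem(x))
    covered += lo <= true_mean <= hi

print(f"empirical coverage = {covered / n_reps:.3f} (nominal 0.95)")
```

With these hypothetical settings the empirical coverage typically lands a little below 0.95, which is the point of the note above: the nominal level is only reached approximately, improving as the sample size grows.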
Tag usage
Questions tagged confidence-interval should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.