Questions tagged [confidence-interval]

In statistics, a confidence interval is a range of plausible values that conveys the precision of an estimate of an underlying parameter. In principle, if the interval is recomputed on many independent samples of data, the resulting intervals should contain the true parameter a set proportion of the time. This proportion is known as the coverage probability (or confidence level), and is most commonly set to 95%.

When estimating a vector of parameters, c(θ), based on observations X of some random variables whose distribution depends on those parameters, a confidence interval (for scalar θ) or confidence region (for vector c(θ)) is a set C = C(X) such that P(c(θ) ∈ C) = 1−α. Note that:

  1. The confidence interval is a function of the data, X, so is itself random.
  2. The statement regarding the probability that c(θ) ∈ C should be regarded with respect to the randomness in X which controls C. Since confidence intervals are a frequentist notion, one should not think of the probability as applying to the unobserved parameter c(θ), which, to a frequentist, is not random.
  3. Often one can only compute approximate confidence intervals, which may have the nominal coverage asymptotically in the sample size.
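
The three points above can be checked with a small simulation. The sketch below is illustrative only and is not taken from the tag description: the function names mean_ci and estimate_coverage, the normal data model, and all parameter values are assumptions made for the example. It builds an approximate (large-sample, normal-quantile) 95% interval for a mean, then repeats the experiment on fresh samples to see how often the random interval contains the fixed true mean.

```python
import random
import statistics
from statistics import NormalDist


def mean_ci(sample, level=0.95):
    """Approximate large-sample confidence interval for the mean.

    Uses a normal quantile with the sample standard deviation, so the
    nominal coverage only holds approximately for finite samples.
    """
    n = len(sample)
    centre = statistics.fmean(sample)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(sample) / n ** 0.5
    # Two-sided normal quantile (about 1.96 when level is 0.95).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * sem, centre + z * sem


def estimate_coverage(true_mean=10.0, sigma=2.0, n=100, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true mean.

    The parameter is held fixed; only the data, and hence the
    interval, changes from trial to trial.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lower, upper = mean_ci(sample)
        hits += lower <= true_mean <= upper
    return hits / trials


coverage = estimate_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The printed fraction should land close to, but not exactly at, 0.95. For small samples a t quantile (for example from scipy.stats) would usually be a better choice than the normal quantile used here.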

Tag usage

Questions under this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning and data analysis.

1131 questions
181 votes, 6 answers

Compute a confidence interval from sample data

I have sample data which I would like to compute a confidence interval for, assuming a normal distribution. I have found and installed the numpy and scipy packages and have gotten numpy to return a mean and standard deviation (numpy.mean(data) with…
Bmayer0122
76 votes, 2 answers

Confidence intervals for predictions from logistic regression

In R predict.lm computes predictions based on the results from linear regression and also offers to compute confidence intervals for these predictions. According to the manual, these intervals are based on the error variance of fitting, but not on…
unique2
59 votes, 1 answer

How to calculate the 95% confidence interval for the slope in a linear regression model in R

Here is an exercise from Introductory Statistics with R: With the rmr data set, plot metabolic rate versus body weight. Fit a linear regression model to the relation. According to the fitted model, what is the predicted metabolic rate for a body…
Yu Fu
52 votes, 3 answers

Correct way to obtain confidence interval with scipy

I have a 1-dimensional array of data: a = np.array([1,2,3,4,4,4,5,5,5,5,4,4,4,6,7,8]) for which I want to obtain the 68% confidence interval (ie: the 1 sigma). The first comment in this answer states that this can be achieved using…
Gabriel
37 votes, 4 answers

Python function to get the t-statistic

I am looking for a Python function (or to write my own if there is not one) to get the t-statistic in order to use in a confidence interval calculation. I have found tables that give answers for various probabilities / degrees of freedom like this…
ChrisProsser
37 votes, 2 answers

scikit-learn - ROC curve with confidence intervals

I am able to get a ROC curve using scikit-learn with fpr, tpr, thresholds = metrics.roc_curve(y_true,y_pred, pos_label=1), where y_true is a list of values based on my gold standard (i.e., 0 for negative and 1 for positive cases) and y_pred is a…
user2836189
33 votes, 4 answers

How can I plot a confidence interval in Python?

I recently started to use Python, and I can't understand how to plot a confidence interval for a given datum (or set of data). I already have a function that computes, given a set of measurements, a higher and lower bound depending on the confidence…
Luigi2405
25 votes, 2 answers

Extract prediction band from lme fit

I have following model x <- rep(seq(0, 100, by=1), 10) y <- 15 + 2*rnorm(1010, 10, 4)*x + rnorm(1010, 20, 100) id <- NULL for(i in 1:10){ id <- c(id, rep(i,101)) } dtfr <- data.frame(x=x,y=y, id=id) library(nlme) with(dtfr, summary( lme(y~x,…
ECII
24 votes, 4 answers

Confidence interval for binomial data in R?

I know that I need mean and s.d to find the interval, however, what if the question is: For a survey of 1,000 randomly chosen workers, 520 of them are female. Create a 95% confidence interval for the proportion of workers who are female based on…
Pig
23 votes, 2 answers

is seaborn confidence interval computed correctly?

First, I must admit that my statistics knowledge is rusty at best: even when it was shining new, it's not a discipline I particularly liked, which means I had a hard time making sense of it. Nevertheless, I took a look at how the barplot graphs were…
anarcat
21 votes, 5 answers

How to use norm.ppf()?

I couldn't understand how to properly use this function, could someone please explain it to me? Let's say I have: a mean of 172.7815 a standard deviation of 4.1532 N = 50 (50 samples) When I'm asked to calculate the (95%) margin of error using…
17 votes, 2 answers

How to plot a time series array, with confidence intervals displayed, in python?

I have some time series which slowly increases, but over a short period of time they are very wavy. For example, the time series could look like: [10 + np.random.rand() for i in range(100)] + [12 + np.random.rand() for i in range(100)] + [14 +…
Ștefan
17 votes, 1 answer

Plotting confidence intervals with NA values

I would like to plot confidence intervals for data with NAs, using the Gviz package. I modified the manual's example to expose my problem. First, as the manual shows: library(Gviz) ## Loading GRanges object data(twoGroups) ## Plot data without NAs dTrack…
user2120870
17 votes, 2 answers

Get 95% confidence interval with glm(..) in R

Here are some data dat = data.frame(y = c(9,7,7,7,5,6,4,6,3,5,1,5), x = c(1,1,2,2,3,3,4,4,5,5,6,6), color = rep(c('a','b'),6)) and the plot of these data if you wish require(ggplot) ggplot(dat, aes(x=x,y=y, color=color)) + geom_point() +…
Remi.b
16 votes, 2 answers

Confidence interval of probability prediction from logistic regression statsmodels

I'm trying to recreate a plot from An Introduction to Statistical Learning and I'm having trouble figuring out how to calculate the confidence interval for a probability prediction. Specifically, I'm trying to recreate the right-hand panel of this…
Taylor