Confidence Interval for t-test (difference between means) in Python

Question

I am looking for a quick way to get the t-test confidence interval in Python for the difference between means. Similar to this in R:

X1 <- rnorm(n = 10, mean = 50, sd = 10)
X2 <- rnorm(n = 200, mean = 35, sd = 14)
# the scenario is similar to my data

t_res <- t.test(X1, X2, alternative = 'two.sided', var.equal = FALSE)    
t_res

Out:

    Welch Two Sample t-test

data:  X1 and X2
t = 1.6585, df = 10.036, p-value = 0.1281
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.539749 17.355816
sample estimates:
mean of x mean of y 
 43.20514  35.79711

> print(c(t_res$conf.int[1], t_res$conf.int[2]))
[1] -2.539749 17.355816

I am not finding anything similar in either statsmodels or scipy, which is strange considering the importance of confidence intervals in hypothesis testing (and how much criticism the practice of reporting only p-values has recently received).

I tagged it with both; maybe folks who use R know the answer for Python. Nowadays a lot of people use both. – Anarcho-Chossid, Aug 02 '15 at 04:17
It's available in statsmodels, but it doesn't have a very convenient interface: http://www.statsmodels.org/stable/generated/statsmodels.stats.weightstats.CompareMeans.html – Josef, Aug 02 '15 at 04:38
Quite a few SO questions give examples; please take a look at [t test](http://stackoverflow.com/questions/2324438/how-to-calculate-the-statistics-t-test-with-numpy) and [confidence interval](http://stackoverflow.com/questions/15033511/compute-a-confidence-interval-from-sample-data) – lrnzcig, Aug 03 '15 at 19:55
I looked at quite a few SO examples, and none of them addresses precisely what I want to do. I need to calculate a confidence interval for a t-test of the difference between means, not a t-test describing my data. – Anarcho-Chossid, Aug 03 '15 at 22:45
Also see [this answer](https://stats.stackexchange.com/a/475345/241268) for how to code it manually using `numpy`, `scipy`, and `pandas`. – Warm_Duscher, Aug 26 '22 at 17:30
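For readers who want to see what a manual version might look like, here is a minimal sketch of a Welch confidence interval for the difference between two means, using only `numpy` and `scipy.stats.t`. The helper name `welch_ci` and the random sample data (loosely mirroring the R example in the question) are illustrative assumptions, not code taken from the linked answer.

```python
import numpy as np
from scipy import stats


def welch_ci(x1, x2, confidence=0.95):
    """Hypothetical helper: CI for mean(x1) - mean(x2), unequal variances."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.size, x2.size
    # Squared standard error of each sample mean
    v1 = x1.var(ddof=1) / n1
    v2 = x2.var(ddof=1) / n2
    diff = x1.mean() - x2.mean()
    se = np.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Two-sided critical value of the t distribution
    t_crit = stats.t.ppf(0.5 + confidence / 2, df)
    return diff - t_crit * se, diff + t_crit * se


# Sample data similar in shape to the R example (values will differ from R's)
rng = np.random.default_rng(0)
X1 = rng.normal(loc=50, scale=10, size=10)
X2 = rng.normal(loc=35, scale=14, size=200)

low, high = welch_ci(X1, X2)
print(low, high)
```

Because the random draws differ from R's `rnorm`, the printed interval will not match the R output above; only the method is the same.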

score 38 · Accepted Answer · answered Dec 29 '15 at 18:07

Here is how to use StatsModels' CompareMeans to calculate the confidence interval for the difference between means:

import numpy as np, statsmodels.stats.api as sms

X1, X2 = np.arange(10,21), np.arange(20,26.5,.5)

cm = sms.CompareMeans(sms.DescrStatsW(X1), sms.DescrStatsW(X2))
print(cm.tconfint_diff(usevar='unequal'))

Output is

(-10.414599391793885, -5.5854006082061138)

and matches R:

> X1 <- seq(10,20)
> X2 <- seq(20,26,.5)
> t.test(X1, X2)

    Welch Two Sample t-test

data:  X1 and X2
t = -7.0391, df = 15.58, p-value = 3.247e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -10.414599  -5.585401
sample estimates:
mean of x mean of y 
       15        23

Ulrich Stern

Hey @ulrich-stern, thanks for your answer. Is this CI for the relative difference or the absolute difference? Do you know how we can calculate a CI for relative differences? – CanCeylan Mar 06 '18 at 11:33
@CanCeylan, my answer is for the "regular" difference. There is a [Cross Validated question](https://stats.stackexchange.com/q/264929/112208) that suggests the bootstrap in case of relative differences. – Ulrich Stern Mar 07 '18 at 15:18
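
To make the bootstrap suggestion concrete, here is a rough sketch of a percentile bootstrap interval for a relative difference. It assumes the relative difference is defined as (mean(x1) - mean(x2)) / mean(x2), and it assumes mean(x2) stays well away from zero. The function name `bootstrap_relative_diff_ci` and its parameters are hypothetical, and the linked question may recommend a different bootstrap variant.

```python
import numpy as np


def bootstrap_relative_diff_ci(x1, x2, n_boot=10_000, confidence=0.95, seed=0):
    """Hypothetical helper: percentile bootstrap CI for (mean1 - mean2) / mean2."""
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Resample each group independently, with replacement, n_boot times
    m1 = rng.choice(x1, size=(n_boot, x1.size), replace=True).mean(axis=1)
    m2 = rng.choice(x2, size=(n_boot, x2.size), replace=True).mean(axis=1)
    # Relative difference for every bootstrap replicate
    rel = (m1 - m2) / m2
    alpha = 1 - confidence
    lo, hi = np.quantile(rel, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


X1, X2 = np.arange(10, 21), np.arange(20, 26.5, .5)
rel_low, rel_high = bootstrap_relative_diff_ci(X1, X2)
print(rel_low, rel_high)
```

The exact endpoints depend on the seed and the number of resamples, so they are not quoted here; with samples this small, the percentile interval should be read as a rough guide only.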

score 0 · Answer 2 · answered Aug 26 '22 at 16:01

An alternate answer using pingouin (basically copied code from here and adapted to use Ulrich Stern's variables)

import numpy as np
import pingouin as pg
x1, x2 = np.arange(10,21), np.arange(20,26.5,.5)
res = pg.ttest(x1, x2, paired=False)
print(res)

prints

            T    dof       tail     p-val            CI95%  cohen-d       BF10  power
T-test -7.039  15.58  two-sided  0.000003  [-10.41, -5.59]    3.009  2.251e+04    1.0
