Questions tagged [hypothesis-test]

Functions used to choose between competing hypotheses about one or more probability distributions. For statistical questions, please use stats.stackexchange.com.

Common hypothesis tests include the one-sample and paired t-tests for means, the z-test (which approximates the t-test for large samples), the F-test for differences in variance, the chi-square test for independence, and Fisher's exact test for differences in proportion.

Please note that this tag is unrelated to tags that refer to software testing.
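
As a rough illustration of the z-test mentioned in the description, here is a minimal sketch that uses only the Python standard library. The function name `one_sample_z_test` and the sample values are hypothetical, and using the sample standard deviation in place of a known population value is an assumption that is only reasonable for large samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, mu0):
    """Two-sided z-test of H0: population mean == mu0.

    Assumption: the sample is large enough that the sample standard
    deviation can stand in for the population standard deviation.
    """
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = (mean(sample) - mu0) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, chosen only to show the call.
z, p_value = one_sample_z_test([1, 2, 3, 4, 5], mu0=3)
print(z, p_value)
```

For small samples a t-test is the safer choice, since the normal approximation understates the tail probabilities.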

349 questions
241
votes
10 answers

Find p-value (significance) in scikit-learn LinearRegression

How can I find the p-value (significance) of each coefficient? lm = sklearn.linear_model.LinearRegression() lm.fit(x,y)
elplatt
  • 3,227
  • 3
  • 18
  • 20
78
votes
3 answers

T-test in Pandas

If I want to calculate the mean of two categories in Pandas, I can do it like this: data = {'Category': ['cat2','cat1','cat2','cat1','cat2','cat1','cat2','cat1','cat1','cat1','cat2'], 'values': [1,2,3,1,2,3,1,2,3,5,1]} my_data =…
hirolau
  • 13,451
  • 8
  • 35
  • 47
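
One way to approach this, sketched with the standard library only so it runs without third-party packages: group the values by category, then compute a pooled-variance two-sample t statistic. The data is copied from the excerpt above; the helper name `pooled_t_statistic` is hypothetical, and the equal-variance assumption is a choice made for this sketch. With pandas and SciPy installed, passing the two grouped series to `scipy.stats.ttest_ind` would also return a p-value.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

data = {
    'Category': ['cat2', 'cat1', 'cat2', 'cat1', 'cat2', 'cat1',
                 'cat2', 'cat1', 'cat1', 'cat1', 'cat2'],
    'values': [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}

# Collect the values belonging to each category.
groups = defaultdict(list)
for category, value in zip(data['Category'], data['values']):
    groups[category].append(value)


def pooled_t_statistic(a, b):
    """Student's two-sample t statistic, assuming equal variances."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


t_stat, df = pooled_t_statistic(groups['cat1'], groups['cat2'])
print(f"t = {t_stat:.4f}, df = {df}")
```

Turning the statistic into a p-value needs the t distribution, which the standard library does not provide.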
28
votes
2 answers

Confidence Interval for t-test (difference between means) in Python

I am looking for a quick way to get the t-test confidence interval in Python for the difference between means. Similar to this in R: X1 <- rnorm(n = 10, mean = 50, sd = 10) X2 <- rnorm(n = 200, mean = 35, sd = 14) # the scenario is similar to my…
Anarcho-Chossid
  • 2,210
  • 4
  • 27
  • 44
9
votes
2 answers

What is the difference between Property Based Testing and Mutation testing?

My context for this question is in Python. Hypothesis Testing Library (i.e. Property Based Testing): https://hypothesis.readthedocs.io/en/latest/ Mutation Testing Library: https://github.com/sixty-north/cosmic-ray
9
votes
1 answer

Testing the equality of multiple coefficients in R

I have the following model: y = b1_group1*X1 + b1_group2*X1 + b2_group1*X2 + b2_group2*X2 + ... + b10_group1*X10 + b10_group2*X10 Easily made in R as follows: OLS <- lm(Y ~ t1:Group + t2:Group + t3:Group + t4:Group + t5:Group + t6:Group + …
user33125
  • 197
  • 1
  • 3
  • 12
9
votes
6 answers

Can R visualize the t.test or other hypothesis test results?

I need to work with many hypothesis tests in R and present the results. Here is an example: > library(MASS) > h=na.omit(survey$Height) > > pop.mean=mean(h) > h.sample = sample(h,30) > > t.test(h.sample,mu=pop.mean) One Sample t-test data: …
Allan Xu
  • 7,998
  • 11
  • 51
  • 122
9
votes
5 answers

How to write a loop to run the t-test of a data frame?

I ran into a problem running a t-test on data stored in a data frame. I know how to do it one test at a time, but that is not efficient at all. How can I write a loop to do it? For example, I have the data in testData: testData <-…
Samo Jerom
  • 2,361
  • 7
  • 32
  • 38
8
votes
1 answer

norm.ppf vs norm.cdf in python's scipy.stats

I have pasted my complete code for your reference. I want to know what ppf and cdf are used for here. Can you explain it? I did some research and found that ppf (percent point function) is the inverse of the CDF (cumulative distribution function) if…
Pushpak Ruhil
  • 176
  • 1
  • 1
  • 9
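
The inverse relationship between the two functions can be checked directly. This sketch uses `statistics.NormalDist` from the Python standard library, whose `cdf` and `inv_cdf` methods play the same roles as `norm.cdf` and `norm.ppf` in `scipy.stats`; the value 1.96 is just an illustrative input.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# cdf maps a value x to the probability P(Z <= x).
p = std_normal.cdf(1.96)

# inv_cdf (the ppf) maps a probability back to the value x.
x = std_normal.inv_cdf(p)

# Typical use of the ppf in testing: the critical value for a
# two-sided test at the 5% significance level.
critical = std_normal.inv_cdf(0.975)

print(p, x, critical)
```

In short, cdf answers "how much probability lies below this value?" and ppf answers "below which value does this much probability lie?".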
8
votes
4 answers

MCAR Little's test in Python

How can I run Little's test for MCAR (missing completely at random) in Python? I have looked at the R package for the same test, but I want to do it in Python. Is there an alternative approach to testing for MCAR?
8
votes
1 answer

Categorical variables usage in pandas for ANOVA and regression?

To prepare a little toy example: import pandas as pd import numpy as np high, size = 100, 20 df = pd.DataFrame({'perception': np.random.randint(0, high, size), 'age': np.random.randint(0, high, size), …
A T
  • 13,008
  • 21
  • 97
  • 158
7
votes
1 answer

Speeding up wilcox.test in R

I am currently trying to implement the Wilcoxon rank-sum test on multiple data sets that I've combined into one large matrix, A, that is 705x17635 (i.e., I want to run the rank-sum test 17,635 times). The only way I've seen how to do this without using…
6
votes
3 answers

R T-Test from N/Mean/SD

I know that if I have a set of data, I can run t.test to do a t-test. But I only know the count, mean and standard deviation for each set. I'm sure there must be a way to do this in R, but I can't figure it out. Any help?
Xodarap
  • 11,581
  • 11
  • 56
  • 94
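
The arithmetic behind a t-test needs only the three summary numbers per group. The sketch below is written in Python rather than R and computes Welch's t statistic with the Welch-Satterthwaite degrees of freedom; the function name `welch_t_from_stats` and the example numbers are hypothetical. In SciPy, `scipy.stats.ttest_ind_from_stats` performs this kind of test from summary statistics and also returns a p-value.

```python
from math import sqrt


def welch_t_from_stats(n1, mean1, sd1, n2, mean2, sd2):
    """Welch's t statistic and degrees of freedom from summary statistics.

    Assumes sd1 and sd2 are sample standard deviations (n - 1 denominator).
    """
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics, chosen only to show the call.
t, df = welch_t_from_stats(n1=10, mean1=5.0, sd1=2.0, n2=10, mean2=3.0, sd2=2.0)
print(t, df)
```

The p-value then comes from a t distribution with `df` degrees of freedom, which requires a statistics library.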
6
votes
1 answer

Hypothesis Testing Skewness and/or Kurtosis in R

How do I specifically test the null and alternative hypothesis of the skewness and/or Kurtosis of a variable in hypothesis testing? Would I have to use a formula in t.test? t.test(data$variable, y = Null) Any help is appreciated. Thanks!
Starbucks
  • 1,448
  • 3
  • 21
  • 49
6
votes
1 answer

Kolmogorov-Smirnov test

I'm using the R function ks.test() to test whether the R random number generator produces uniformly distributed values. I'm using the following code: replicate(100000, ks.test(runif(n),y="punif")). When n is less than or equal to 100 it works, but when n is greater…
Egodym
  • 453
  • 1
  • 8
  • 23
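
For reference, the two-sided statistic that a one-sample Kolmogorov-Smirnov test against Uniform(0, 1) is built on can be computed by hand. This is a minimal Python sketch of the textbook definition, not a reimplementation of R's `ks.test()`; the function name `ks_statistic_uniform` is hypothetical, and it computes only the statistic D, not a p-value.

```python
import random


def ks_statistic_uniform(sample):
    """Two-sided KS statistic D against the Uniform(0, 1) distribution.

    Assumes every value lies in [0, 1], so the reference CDF is F(x) = x.
    """
    xs = sorted(sample)
    n = len(xs)
    # Largest gap where the empirical CDF sits above the reference CDF.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    # Largest gap where the empirical CDF sits below the reference CDF.
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.random() for _ in range(1000)]
print(ks_statistic_uniform(sample))
```

For uniformly distributed input, D typically shrinks roughly in proportion to 1/sqrt(n) as the sample grows.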
6
votes
2 answers

What's the fastest way to apply t.test to each column of a large matrix?

Suppose I have a large matrix: M <- matrix(rnorm(1e7),nrow=20) Further suppose that each column represents a sample. Say I would like to apply t.test() to each column, is there a way to do this that is much faster than using apply()? apply(M, 2,…
Alex
  • 4,030
  • 8
  • 40
  • 62