Questions tagged [p-value]

In statistical significance testing, the p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one actually observed.

The p-value is a key concept in Ronald Fisher's approach, where it measures the weight of the data against a specified hypothesis and serves as a guideline for ignoring data that do not reach a specified significance level. Fisher's approach does not involve an alternative hypothesis; that belongs to the Neyman–Pearson approach. The p-value should not be confused with the Type I error rate (false positive rate) α in the Neyman–Pearson approach. Although α is also called a "significance level" and is often set to 0.05, the two terms have different meanings, the approaches are incompatible, and the numbers p and α cannot meaningfully be compared.
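As a small illustration of the definition, the sketch below computes an exact one-sided p-value for a coin-flip experiment. The function name `binomial_p_value` and the data (9 heads in 10 flips) are hypothetical, chosen only to make "at least as extreme as observed" concrete; the null hypothesis assumed here is a fair coin.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    # Sum the null probabilities of every outcome at least as extreme
    # as the observed count.
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 heads in 10 flips, null hypothesis "the coin is fair".
p = binomial_p_value(9, 10)
print(p)  # 11/1024, roughly 0.0107
```

Only the outcomes "9 heads" and "10 heads" count as at least as extreme here, which is why the result is 11 out of the 1024 equally likely sequences.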

791 questions
57
votes
6 answers

Put stars on ggplot barplots and boxplots - to indicate the level of significance (p-value)

It's common to put stars on barplots or boxplots to show the level of significance (p-value) of one group or between two groups. Below are several examples: The number of stars is defined by the p-value; for example, one can put 3 stars for p-value < 0.001,…
Ali
  • 9,440
  • 12
  • 62
  • 92
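The star convention mentioned in this question can be sketched as a simple lookup. Only the "3 stars for p-value < 0.001" rule comes from the excerpt; the 0.01 and 0.05 cutoffs and the `"ns"` label are a common convention assumed here, and `significance_stars` is a hypothetical helper name. The question itself concerns ggplot in R, so this Python version only shows the mapping, not the plotting.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to a star label for plot annotation."""
    # "< 0.001 -> three stars" is from the question; the remaining
    # thresholds are an assumed, commonly used convention.
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # not significant at the assumed 0.05 level


labels = [significance_stars(p) for p in (0.0004, 0.003, 0.02, 0.3)]
print(labels)
```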
52
votes
5 answers

Extract pvalue from glm

I'm running many regressions and am only interested in the effect on the coefficient and p-value of one particular variable. So, in my script, I'd like to be able to just extract the p-value from the glm summary (getting the coefficient itself is…
ch-pub
  • 1,664
  • 6
  • 29
  • 52
32
votes
5 answers

Calculating adjusted p-values in Python

So, I've been spending some time looking for a way to get adjusted p-values (aka corrected p-values, q-values, FDR) in Python, but I haven't really found anything. There's the R function p.adjust, but I would like to stick to Python coding, if…
erikfas
  • 4,357
  • 7
  • 28
  • 36
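For orientation, one widely used adjustment (the Benjamini-Hochberg FDR procedure) is short enough to sketch in plain Python. This is a minimal illustration rather than a replacement for a tested library; the function name `benjamini_hochberg` and the sample p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # keeping a running minimum so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four tests.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```

The running minimum also caps every adjusted value at 1, since it starts there.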
31
votes
7 answers

Stepwise regression using p-values to drop variables with nonsignificant p-values

I want to perform a stepwise linear regression using p-values as a selection criterion, e.g., at each step dropping the variables that have the highest (i.e., the most insignificant) p-values, stopping when all values are significant defined by some…
DainisZ
  • 465
  • 1
  • 7
  • 8
23
votes
5 answers

Extract Regression P Value in R

I am performing multiple regressions on different columns in a query file. I've been tasked with extracting certain results from the regression function lm in R. So far I have, > reg <- lm(query$y1 ~ query$x1 + query$x2) >…
Harmzy15
  • 487
  • 3
  • 6
  • 15
18
votes
3 answers

Python sklearn - how to calculate p-values

This is probably a simple question but I am trying to calculate the p-values for my features either using classifiers for a classification problem or regressors for regression. Could someone suggest what is the best method for each case and provide…
user1096808
  • 253
  • 1
  • 4
  • 11
14
votes
1 answer

ggplot2: add p-values to the plot

I got this plot Using the code below library(dplyr) library(ggplot2) library(ggpmisc) df <- diamonds %>% dplyr::filter(cut%in%c("Fair","Ideal")) %>% dplyr::filter(clarity%in%c("I1" , "SI2" , "SI1" , "VS2" , "VS1", "VVS2")) %>% …
shiny
  • 3,380
  • 9
  • 42
  • 79
13
votes
3 answers

How to get survdiff returned p value

I am using R survival package, survdiff function. I wonder how to get the p value from the return value. > diff = survdiff(Surv(Time, Censored) ~ Treatment+Gender, data = dat) > diff Call: survdiff(formula = Surv(Time, Censored) ~ Treatment +…
tsznxyz
  • 199
  • 1
  • 7
12
votes
2 answers

Calculation p-values of a f-statistic with R

I'm trying to calculate p-values of a f-statistic with R. The formula R uses in the lm() function is equal to (e.g. assume x=100, df1=2, df2=40): pf(100, 2, 40, lower.tail=F) [1] 2.735111e-16 which should be equal to 1-pf(100, 2, 40) [1]…
cjena
  • 307
  • 1
  • 3
  • 9
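The numbers in this question happen to allow a closed-form check: when the numerator degrees of freedom equal 2, the upper tail of the F distribution is (1 + 2x/df2)^(-df2/2). The sketch below uses that identity to show why computing the upper tail directly and computing 1 minus the CDF can disagree for tiny p-values in double precision. The function name `f_upper_tail_df1_2` is hypothetical, and the formula applies only to df1 = 2.

```python
def f_upper_tail_df1_2(x: float, df2: float) -> float:
    """P(F > x) for an F distribution with df1 = 2 (closed form)."""
    return (1.0 + 2.0 * x / df2) ** (-df2 / 2.0)


# Values from the question: x = 100, df1 = 2, df2 = 40.
upper = f_upper_tail_df1_2(100.0, 40.0)  # upper tail computed directly
cdf = 1.0 - upper                        # CDF rounded to the nearest double
naive = 1.0 - cdf                        # "1 - CDF" loses most of the digits

print(upper)  # about 2.735e-16, the value quoted in the question
print(naive)  # differs, because doubles near 1 are spaced about 1.1e-16 apart
```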
10
votes
1 answer

Finding Two-Tailed P Value from t-distribution and Degrees of Freedom in Python

How do I determine the p-value of a t-distribution with n degrees of freedom? Research on this subject points me to this Stack Exchange answer: https://stackoverflow.com/a/17604216 I assume np.abs(tt) is the t-value, but how do I work in degrees…
hoshi
  • 137
  • 1
  • 1
  • 8
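With SciPy installed, `2 * scipy.stats.t.sf(abs(t), df)` is the usual one-liner for this. As a dependency-free illustration of what that computes, the sketch below integrates the Student t density numerically with Simpson's rule. The names `t_pdf` and `two_tailed_p` are hypothetical, and because the result is formed as 1 minus the central probability, it is not accurate for extremely small p-values.

```python
from math import exp, lgamma, log, pi


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def two_tailed_p(t_stat: float, df: float, steps: int = 10_000) -> float:
    """Two-tailed p-value: 1 - P(-|t| < T < |t|), via Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so double the integral over [0, |t|].
    central_mass = 2 * total * h / 3
    return max(0.0, 1.0 - central_mass)


print(two_tailed_p(1.0, 1))  # df = 1 is the Cauchy case, where p = 0.5 exactly
```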
8
votes
6 answers

Adjust p-values for multiple comparisons in Matlab

I have a cell array of p-values that have to be adjusted for multiple comparisons. How can I do that in Matlab? I can't find a built-in function. In R I would do: data.pValue_adjusted = p.adjust(data.pValue, method='bonferroni') Is there a similar…
Martin Preusse
  • 9,151
  • 12
  • 48
  • 80
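The Bonferroni correction named in the R call is simple enough to write by hand in any language: multiply each p-value by the number of tests and cap the result at 1. A minimal sketch in Python (not Matlab, which the question asks about), with the hypothetical name `bonferroni` and made-up p-values:

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)  # number of comparisons
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three comparisons.
corrected = bonferroni([0.01, 0.2, 0.5])
print(corrected)
```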
8
votes
2 answers

How to get the p-value between two groups after groupby in pandas?

I am stuck on how to apply the custom function to calculate the p-value for two groups obtained from pandas groupby. vocabulary test = 0 ==> test test = 1 ==> control problem setup import numpy as np import pandas as pd import scipy.stats as…
BhishanPoudel
  • 15,974
  • 21
  • 108
  • 169
7
votes
1 answer

p-values from ridge regression in python

I'm using ridge regression (ridgeCV). And I've imported it from: from sklearn.linear_model import LinearRegression, RidgeCV, LarsCV, Ridge, Lasso, LassoCV How do I extract the p-values? I checked but ridge has no object called summary. I couldn't…
7
votes
2 answers

How to manually compute the p-value of t-statistic in linear regression

I did a linear regression for a two tailed t-test with 178 degrees of freedom. The summary function gives me two p-values for my two t-values. t value Pr(>|t|) 5.06 1.04e-06 *** 10.09 < 2e-16 *** ... ... F-statistic: 101.8 on 1 and 178 DF,…
Frosi
  • 177
  • 5
  • 12
7
votes
2 answers

How to efficiently get the correlation matrix (with p-values) of a data frame with NaN values?

I am trying to compute a matrix of correlation, and filter the correlations based on the p-values to find out the highly correlated pairs. To explain what I mean, say I have a data frame like this. df A B C D 0 2 NaN…
ju.
  • 1,016
  • 1
  • 13
  • 34