I am running into an issue in R and am not quite sure what is happening. When I run a regression and a t.test on the same variables, the t.test appears to drop ~100 participants (the df is 233.93 for the t-test versus 382 for the regression), giving me different p-values. However, if I compute the means separately for the full sample, they match the ones shown in the t-test output.
Can anyone explain what might be happening? Below is the code and output for both the regression and the t-test. Note that the DV is measured on a 1 to 7 scale and the IV is a 0/1 dummy.
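For reference, the group means on the full sample can be reproduced with something along these lines (a quick sketch using the same data frame and variable names; not necessarily the exact call I used):

# mean of confident within each level of get.surgery, ignoring NAs
tapply(d$confident, d$get.surgery, mean, na.rm = TRUE)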
The regression output:
Call:
lm(formula = confident ~ get.surgery, data = d)
Residuals:
Min 1Q Median 3Q Max
-4.2989 -0.7767 0.2233 0.7011 1.7011
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.29893 0.07714 68.692 < 2e-16 ***
get.surgery 0.47777 0.14895 3.208 0.00145 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.293 on 382 degrees of freedom
Multiple R-squared: 0.02623, Adjusted R-squared: 0.02368
F-statistic: 10.29 on 1 and 382 DF, p-value: 0.001451
and the t-test:
t.test(confident ~ get.surgery, data = d)
Welch Two Sample t-test
data: confident by get.surgery
t = -3.6106, df = 233.93, p-value = 0.0003737
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.7384624 -0.2170709
sample estimates:
mean in group 0 mean in group 1
5.298932 5.776699
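In case it is relevant, this is the kind of check I could run to see how many observations each analysis actually uses (again just a sketch with the same data frame d):

# per-group counts, with missing values in the DV tabulated separately
table(d$get.surgery, is.na(d$confident), useNA = "ifany")

# number of complete cases available to lm() and t.test()
sum(complete.cases(d[, c("confident", "get.surgery")]))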