I have noticed weird behaviour of lm(): specifically, its t values do not match those of t.test(). The behaviour is only observable on my machine, regardless of which packages are loaded or which objects are in the global environment. Running the example from help(t.test):

t.test(extra ~ group, data = sleep, var.equal = TRUE)

yields the following results:

## 
##  Two Sample t-test
## 
## data:  extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.363874  0.203874
## sample estimates:
## mean in group 1 mean in group 2 
##            0.75            2.33

whereas the "same" model fitted with lm():

summary(lm(extra ~ group, data = sleep))

yields:

## 
## Call:
## lm(formula = extra ~ group, data = sleep)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.0095  -0.1152   1.3117   3.4194  11.4571 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   -1.488      2.028  -0.734   0.4725  
## group2         4.962      2.147   2.311   0.0329 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.823 on 18 degrees of freedom
## Multiple R-squared:  0.3699, Adjusted R-squared:  0.3349 
## F-statistic: 10.57 on 1 and 18 DF,  p-value: 0.00444

The t value from t.test() is -1.8608, while the t value from lm() is 2.311. What are possible reasons for this discrepancy?

This code was run in an R Markdown document (and therefore in a new session) without any other code being run beforehand.

Session Info

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-solus-linux-gnu (64-bit)
## Running under: Solus 4.0 Fortitude
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib64/haswell/libopenblas_haswellp-r0.3.2.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
##  [5] yaml_2.2.0      Rcpp_1.0.3      stringi_1.4.3   rmarkdown_1.17 
##  [9] knitr_1.25      stringr_1.4.0   xfun_0.10       digest_0.6.23  
## [13] rlang_0.4.2     evaluate_0.14

Other "Machine"

I have tried the same thing in a docker container (rocker/verse:3.6.1, same R-Version as my machine) which yields consistent results:

t.test(extra ~ group, data = sleep, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  extra by group
## t = -1.8608, df = 18, p-value = 0.07919
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.363874  0.203874
## sample estimates:
## mean in group 1 mean in group 2 
##            0.75            2.33

summary(lm(extra ~ group, data = sleep))
## 
## Call:
## lm(formula = extra ~ group, data = sleep)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.430 -1.305 -0.580  1.455  3.170 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   0.7500     0.6004   1.249   0.2276  
## group2        1.5800     0.8491   1.861   0.0792 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.899 on 18 degrees of freedom
## Multiple R-squared:  0.1613, Adjusted R-squared:  0.1147 
## F-statistic: 3.463 on 1 and 18 DF,  p-value: 0.07919

sessionInfo()
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Debian GNU/Linux 9 (stretch)
## 
## Matrix products: default
## BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.1  magrittr_1.5    tools_3.6.1     htmltools_0.4.0
##  [5] yaml_2.2.0      Rcpp_1.0.3      stringi_1.4.3   rmarkdown_1.16 
##  [9] knitr_1.25      stringr_1.4.0   xfun_0.10       digest_0.6.22  
## [13] rlang_0.4.1     evaluate_0.14

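As far as I can tell, the only relevant difference is the BLAS/LAPACK library (libopenblas_haswellp-r0.3.2.so on my machine vs. libopenblasp-r0.2.19.so in the container); apart from that, only the operating system and a few minor package versions differ.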
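A quick way to tell which of the two outputs is the wrong one is to compare the lm() fit against numbers that need no linear algebra at all. The following is a minimal sketch, assuming the default treatment contrasts, under which the intercept should equal the mean of group 1 and the group2 coefficient should equal the difference between the group means. The variable names are only illustrative.

```r
# Group means computed with plain arithmetic (no matrix routines involved)
m <- as.vector(tapply(sleep$extra, sleep$group, mean))

fit <- lm(extra ~ group, data = sleep)
tt  <- t.test(extra ~ group, data = sleep, var.equal = TRUE)

# Under treatment contrasts: (Intercept) = mean of group 1,
# group2 = mean of group 2 minus mean of group 1
expected <- c(m[1], m[2] - m[1])
coef_ok  <- isTRUE(all.equal(unname(coef(fit)), expected))

# The pooled-variance t-test and the lm() slope test should agree up to sign
t_lm <- summary(fit)$coefficients["group2", "t value"]
t_ok <- isTRUE(all.equal(abs(t_lm), abs(unname(tt$statistic))))

cat("coefficients match group means:", coef_ok, "\n")
cat("t values agree in magnitude:   ", t_ok, "\n")
```

If either check prints FALSE, the lm() result is the one that is off, since the group means and the t.test() output can be verified by hand.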

Ulrich Eckhardt
AaronP
  • Can you check that there are no other variables in your environment? For the first lm() fit, you get an intercept of -1.488 and a group2 coefficient of 4.962. This means a mean of -1.488 in group 1 and a difference of 4.962 between the groups (i.e. a mean of about 3.474 in group 2). This is impossible if we are talking about the same data being fed into the t.test – StupidWolf Nov 30 '19 at 10:48
  • @StupidWolf This is directly from a fresh R session, with nothing in my environment. I am just puzzled. I have repeated this several times after restarting my machine and the R session without restoring my environment, and I even reinstalled R and RStudio – AaronP Nov 30 '19 at 10:53
  • Hmm Aaron, I get the same output as the "Other Machine".. the output that makes sense.. – StupidWolf Nov 30 '19 at 10:58

1 Answer

I struggled with the same thing recently on my SolusOS setup and also posted about it here. The culprit seems to be the OpenBLAS library, specifically libopenblas_haswellp-r0.3.2.so. As soon as I switched to a different build, in my case libopenblas_core2p-r0.3.2.so, I started getting the correct results on my SolusOS setup.
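To see which library a given session actually picked up, and whether it produces sane numbers, something along these lines may help. It is only a sketch: extSoftVersion() and La_library() report the BLAS and LAPACK shared libraries in R 3.4.0 and later, and the two checks compare matrix results against plain element-wise arithmetic. Whether these particular calls hit the same faulty code path as lm() is an assumption on my part, so a passing result does not prove the library is fine.

```r
# Report which BLAS/LAPACK shared libraries this session is using
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()
cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Design matrix and response for the sleep example
X <- model.matrix(~ group, data = sleep)
y <- sleep$extra

# X'X via crossprod() (normally BLAS-backed) vs. the same sums done element-wise
xtx_blas  <- unname(crossprod(X))
p         <- ncol(X)
xtx_plain <- matrix(0, p, p)
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    xtx_plain[i, j] <- sum(X[, i] * X[, j])
  }
}
products_ok <- isTRUE(all.equal(xtx_blas, xtx_plain))

# Normal equations solved via solve() (LAPACK) vs. the group means
beta     <- drop(solve(crossprod(X), crossprod(X, y)))
m        <- as.vector(tapply(y, sleep$group, mean))
solve_ok <- isTRUE(all.equal(unname(beta), c(m[1], m[2] - m[1])))

cat("matrix products look right:", products_ok, "\n")
cat("linear solve looks right:  ", solve_ok, "\n")
```

Running this once with each library in place shows directly whether switching libraries changed the outcome.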

(I actually just edited my own post to include this info as well)

Voltti