Different P-values between wilcox_test() & wilcox.test() in R

Question

While I was trying to perform a paired Wilcoxon test for the variable "Metabolite", I noticed different p-values between wilcox_test() & wilcox.test() when I tested the following variable:

structure(list(Visit = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L), .Label = c("BSL", "LV"), class = "factor"), Metabolite = c(NA, 
9.602, 9.0102, 4.524, 3.75, NA, 6.596, 7.065, 6.877, NA, NA, 
10.1846, 13.521, 7.8219, NA, 4.9149, 4.0754, 4.7635, 8.8554, 
4.3442, NA, 16.659, NA, 3.698, 6.623, 5.158, 11.719, 3.206, NA, 
2.225, 7.417, 1.42, NA, NA, 2.752, 6.504, 7.594, 6.652, NA, NA, 
3.784, 2.7311, 4.1749, 2.6659, 0.5592, NA, 4.2326, 4.3808, 3.624, 
4.29, 7.098, 6.532, 3.699, 9.297, 8.275, NA)), class = "data.frame", row.names = c(NA, 
-56L))



# The p-value derived from result 1 (p=0.0079) is different from that from result 2 (p=0.003279):


        result1 <- wilcox_test(data=Data_pairs,  Metabolite~Visit, paired = TRUE)
        
        # A tibble: 1 x 7
        #.y.     group1 group2    n1    n2 statistic      p
        #* <chr>   <chr>  <chr>  <int> <int>     <dbl>  <dbl>
        #        1 Metabolite BSL    LV        28    28       131 0.0079
        
        
        result2 <- wilcox.test(data=Data_pairs,  Metabolite~Visit, paired = TRUE)
        #   Wilcoxon signed rank exact test
        
        #data:  Metabolite by Visit
        #V = 197, p-value = 0.003279
        #alternative hypothesis: true location shift is not equal to 0

Using different statistical software, it seems that the wrong p-value is that derived from result2.

Is there any suggestion/advice on how to correct my code and what is the reason for this difference?

Thank you in advance for your help.

1) your `result1` is not reproducible; 2) `coin::wilcox_test` is not a paired test and does not have a `paired` argument; 3) see note in `?wilcox.test`: `The formula interface is only applicable for the 2-sample tests.` — rawr, Dec 03 '21 at 16:57
wilcox_test is from rstatix, not coin, think OP should clarify that. — StupidWolf, Dec 04 '21 at 10:25

StupidWolf · Answer 1 · 2021-12-04T10:26:28.860

You should point out you are using rstatix to avoid the confusion:

rstatix::wilcox_test(data=Data_pairs,  Metabolite~Visit, paired = TRUE)
# A tibble: 1 x 7
  .y.        group1 group2    n1    n2 statistic      p
* <chr>      <chr>  <chr>  <int> <int>     <dbl>  <dbl>
1 Metabolite BSL    LV        28    28       131 0.0079

If it is indeed paired data, then you need to ensure you provide the complete pairs, taking your data, we can place them side by side, assuming the your observations 1 to 28 and from the same sample as 29 to 56:

da = do.call(cbind,split(Data_pairs$Metabolite,Data_pairs$Visit))
da

          BSL     LV
 [1,]      NA     NA
 [2,]  9.6020 2.2250
 [3,]  9.0102 7.4170
 [4,]  4.5240 1.4200
 [5,]  3.7500     NA
 [6,]      NA     NA
 [7,]  6.5960 2.7520
 [8,]  7.0650 6.5040
 [9,]  6.8770 7.5940
[10,]      NA 6.6520
[11,]      NA     NA
[12,] 10.1846     NA
[13,] 13.5210 3.7840
[14,]  7.8219 2.7311
[15,]      NA 4.1749
[16,]  4.9149 2.6659
[17,]  4.0754 0.5592
[18,]  4.7635     NA
[19,]  8.8554 4.2326
[20,]  4.3442 4.3808
[21,]      NA 3.6240
[22,] 16.6590 4.2900
[23,]      NA 7.0980
[24,]  3.6980 6.5320
[25,]  6.6230 3.6990
[26,]  5.1580 9.2970
[27,] 11.7190 8.2750
[28,]  3.2060     NA

We test only the complete:

wilcox.test(da[complete.cases(da),1],da[complete.cases(da),2],paired=TRUE)

data:  da[complete.cases(da), 1] and da[complete.cases(da), 2]
V = 131, p-value = 0.007904
alternative hypothesis: true location shift is not equal to 0

I am sorry for not pointing out that the package was: rstatix. Yes, my observations 1 to 28 are paired with the 29 to 56. I have a dataset of >10.000 metabolites, and therefore my dataset is in a long format; thus, I wondered if the wilcox.test() applies to this type of data structure. Thank you all very much for your help. — Geli, Dec 10 '21 at 04:27

Different P-values between wilcox_test() & wilcox.test() in R

1 Answers1