0

I am new to r and I am a function written to perform Cochran's Q test for homogeneity. I am trying to find the p value but the function returns a std. error instead of a p value as you can see below. I fear I may have written the code incorrectly. Please help if you can.

dput(g1):

c(Estimate = 0.89502349171762, `Std. Error` = 0.198551047700126, 
`z value` = 4.50777521491291, `Pr(>|z|)` = 6.55109263453831e-06

dput(g2):

c(Estimate = 0.798619626770536, `Std. Error` = 0.0963490218317536, 
`z value` = 8.28881924888765, `Pr(>|z|)` = 1.14378204173144e-16

dput(g3):

c(Estimate = 0.493965082375263, `Std. Error` = 0.27251885149614, 
`z value` = 1.81259050397201, `Pr(>|z|)` = 0.0698950035163995    
Qtest <- function(g1,g2,g3){
   w1 <- 1/(g1["Std. Error"])^2
   w2 <- 1/(g2["Std. Error"])^2
   w3 <- 1/(g3["Std. Error"])^2
   avg <- (w1*g1["Estimate"] + w2*g2["Estimate"]+ w3*g3["Estimate"])/(w1+w2+w3)
   test <- w1*(g1["Estimate"]-avg)^2 + w2*(g2["Estimate"] - avg)^2+ w3*(g3["Estimate"] - avg)^2
   pval <- 1-pchisq(test, df=1)
   return(pval)
 }
 Qtest(g1,g2,g3)
 # Std. Error 
 #  0.225857 
user2554330
  • 37,248
  • 4
  • 43
  • 90
jayray23
  • 3
  • 2
  • Please share your data in a [reproducible format](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) (use `dput(g1)` rather than showing the printed result). The result is probably just keeping the name from the input vector. Try `return(unname(pval))`. R has no way of knowing if what you are calculating is a p-value or a standard error – MrFlick Sep 03 '20 at 19:44
  • Aside: there’s no reason to use `return` here; in R, every expression has a value, and the value of a `{}`-sequence is the value of the last expression inside it. Consequently, the value of a function is the value of the last expression of the function when called. — This means that instead of `pval <- 1-pchisq(test, df=1)` and `return(pval)` you only need to write `1 - pchisq(test, df=1)`. – Konrad Rudolph Sep 03 '20 at 20:37

1 Answers1

0

The value returned by this function is a named vector of length 1. The vector is named purely by accident: your calculation is itself using named vectors (g1, g2 and g3), and subsetting these vectors does not remove the name.

So when you write g1["Std. Error"] then that value is also named:

 〉g1["Std. Error"]
Std. Error
  0.198551

And the name does not automatically go away during the subsequent calculations.

But the name does not otherwise carry meaning: you can just ignore it; the value returned by that function is the value you calculated — in other words, it’s the p-value of rejecting the hypothesis tested by Cochran’s Q test.

You can remove the bothersome name inside the function via unname:

Qtest <- function (g1, g2, g3) {
    w1 <- 1 / g1["Std. Error"] ^ 2
    w2 <- 1 / g2["Std. Error"] ^ 2
    w3 <- 1 / g3["Std. Error"] ^ 2
    avg <- (w1 * g1["Estimate"] + w2 * g2["Estimate"]+ w3 * g3["Estimate"]) / (w1 + w2 + w3)
    test <- w1 * (g1["Estimate"] - avg) ^ 2 + w2 * (g2["Estimate"] - avg) ^ 2 + w3 * (g3["Estimate"] - avg) ^ 2
    unname(1 - pchisq(test, df = 1))
}

I’ve also added some whitespace to make the function more readable; and removed the redundant return function call — but otherwise the function is unchanged. There are other ways of rewriting the function that take advantage of R’s vectorised operations. For instance, I’d be tempted to rewrite it as follows:

Qtest = function (g) {
    w = 1 / g[, 'Std. Error'] ^ 2
    avg = sum(w * g[, 'Estimate']) / sum(w)
    test = sum(w * (g[, 'Estimate'] - avg) ^ 2)
    1 - pchisq(test, df = 1)
}

This function expects a matrix of results:

Qtest(rbind(g1, g2, g3))

(That said, I’m not convinced that this implementation is correct — at the very least it’s not general, since the degrees of freedom are fixed at 1.)

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214