1

I decided to try my hand writing a simple custom function doing a t-test on some lm()-produced regression's estimators (for example, H_0: Beta_j = "some constant" vs H_1: Beta_j < "some constant").

This is my first time creating my own code, but I have been working with R for a few months and I think I have a decent understanding of it, so I do not understand why I keep getting a "subscript out of bounds" on running it.

My code:

custom_test<-function(data,coeff,alt,alternative=c("two.sided","greater","less"),clevel=.95){
  dof<-data$df.residual
  top<-data$coefficients["coeff"]-alt
  bottom=coef(summary(data))["coeff","Std. Error"]
  stat<-abs(top/bottom)
  if (alternative=="two.sided") {
    tstat<-qt(clevel/2,dof)
    pstat<-2*pt(tstat,dof)
    return(pstat)
  } else if (alternative=="greater") {
      tstat<-qt(clevel/2,dof)
      pstat<-pt(tstat,dof)
      return(pstat)
  } else if (alternative=="less") {
      tstat<-qt(clevel/2,dof)
      pstat<-pt(tstat,dof)
      return(pstat)
  } else {
      return("Error")
  }

}

And I try to run this with standard lm() results, hrsemp being a var., and get the error:

custom_test(fit9,hrsemp,0,alternative="less")
Error in coef(summary(data))["coeff", "Std. Error"] : 
  subscript out of bounds

But everytime I run the problematic code manually myself, I do get an answer:

> coef(fit9)
(Intercept)      hrsemp  log(sales) log(employ) 
12.45837237 -0.02926893 -0.96202698  0.76147045 
> coef(summary(fit9))["hrsemp", "Std. Error"]
[1] 0.02280484

Other Stack Exchange questions regarding this error all seem to be subtly different, and I haven't been able to generalize their lessons to my code thus far.

Where am I going wrong?

halfer
  • 19,824
  • 17
  • 99
  • 186
Coolio2654
  • 1,589
  • 3
  • 21
  • 46
  • You should make sure you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) so we can also run and test the function. Right now we are unsure what you are actually passing to the function. – MrFlick Sep 29 '17 at 19:18
  • 1
    The error says that either "coeff" is not a rowname or "Std. Error" is not a column name in the object you're indexing. Read `?Extract` -- you passed an invalid index, the same as any other SO posters who hit this error. – Frank Sep 29 '17 at 19:24
  • I edited my question above. I looked at what you said, but if I type out the problematic portion of my code manually, I don't get a problem, but for some reason in my code it does give me that error. Why could this be? – Coolio2654 Sep 29 '17 at 21:10

1 Answers1

1

Frank is right; you're getting this error for the same reason at base as everyone else who does: You've tried to access an element of an object that doesn't exist. More specifically, in your case, you're trying to access an element in the "coeff" row and "Std. Error" column of coef(summary(data)). This is a problem because there will likely be no coefficient named "coeff". You want to do the following:

custom_test<-function(data,coeff,alt,alternative=c("two.sided","greater","less"),clevel=.95){
    dof<-data$df.residual
    top<-data$coefficients[coeff]-alt
    bottom=coef(summary(data))[coeff,"Std. Error"]
    stat<-abs(top/bottom)
    if (alternative=="two.sided") {
        tstat<-qt(clevel/2,dof)
        pstat<-2*pt(tstat,dof)
        return(pstat)
    } else if (alternative=="greater") {
        tstat<-qt(clevel/2,dof)
        pstat<-pt(tstat,dof)
        return(pstat)
    } else if (alternative=="less") {
        tstat<-qt(clevel/2,dof)
        pstat<-pt(tstat,dof)
        return(pstat)
    } else {
        return("Error")
    }

}

and pass the variable name as a string:

set.seed(42)
hrsemp <- rnorm(10)
Y <- 1 + 5 * hrsemp + rnorm(10)
fit9 <- lm(Y ~ hrsemp)
custom_test(fit9, 'hrsemp', 0, alternative="less")
[1] 0.475

(Note you could alternatively feed the function the actual variable object and use deparse(substitute(coeff)) -- for example, see this SO question).

However, you may notice this gives you the wrong answer. That's because you've written your function incorrectly. You probably want something more like this:

custom_test <- function(data, coeff, alt,
                        alternative = c("two.sided", "greater", "less"),
                        clevel = .95){
    dof <- data$df.residual
    top <- data$coefficients[coeff] - alt
    bottom <- coef(summary(data))[coeff, "Std. Error"]
    stat <- abs(top/bottom)
    if ( alternative == "two.sided" ) {
        return(2 * (1 - pt(stat, dof)))
    } else if ( alternative == "greater" ) {
        return(1 - pt(stat, dof))
    } else if ( alternative == "less" ) {
        return(1 - pt(stat, dof))
    } else {
        stop("Provide a valid alternative hypothesis.", call.=FALSE)
    }
}


custom_test(fit9, 'hrsemp', 0, alternative="less")
      hrsemp 
7.858176e-05 
custom_test(fit9, 'hrsemp', 0, alternative="two.sided")
      hrsemp 
0.0001571635 
coef(summary(fit9))['hrsemp', 'Pr(>|t|)']
[1] 0.0001571635

One of many good explanations of why this is the right calculation can be found here.

duckmayr
  • 16,303
  • 3
  • 35
  • 53
  • I was deep-down suspecting the double-quotes were throwing off how var. names were being input into the code. Thank you so much for clarifying that for me; someone to explain the nuance of the formatting helped me out tremendously. You even correcting my amateurishly written t-test. I will use your code to figure out how I can build the t-test myself. To advancement! – Coolio2654 Sep 30 '17 at 18:34
  • No problem, glad it was helpful. You'll get better at noticing the little things like that with time. Good luck! – duckmayr Sep 30 '17 at 18:54