Q: How do I subset a subset of a changing variable inside a function or loop?

Assume I have the following code for determining regression stats for multiple data sets:

dat1 <- data.frame(col1=1:5,col2=6:10)
dat2 <- data.frame(col1=11:15,col2=16:20)
func <- function(data,col.no){
  get(data)[,col.no]
}

for(i in c('dat1','dat2')) {
  mod.name <- paste0('fit.',i)
  assign(mod.name,lm(get(i)[,1]~get(i)[,2]),envir = .GlobalEnv)
}

mod.pvals <- NULL
p.func <- function(attr.name) {
  for(i in c('dat1','dat2')) {
    mod.name <- paste0('fit.',i)
    p.val <- summary(get(mod.name))[attr.name]
    mod.pvals <- c(mod.pvals,p.val)
  }
  mod.pvals
}

r.vals <- p.func('r.squared')
adj.r.vals <- p.func('adj.r.squared')
coef.vals <-  p.func('coefficients')

This works just fine with 'r.squared', 'adj.r.squared', etc., but I want to access the p-value of the model in the same function.

Outside of a function I'd choose:

summary(fit)$coefficients[2,4] 

But how do I do this inside the function?

I unsuccessfully tried:

summary(get(mod.name))['coefficients'][2,4]
Error in summary(get(mod.name))["coefficients"][[2, 4]] : 
  incorrect number of subscripts

So then I thought about just changing my code for p.val in the function above:

p.val <- paste0('summary(',mod.name ,')$',attr.name)
get(p.val)

But when I run the code I get the following error:

p.vals <- p.func('coefficients[2,4]')
Error in get(p.val) : 
  object 'summary(fit.dat1)$coefficients[2,4]' not found

I guess get() doesn't work like this. Is there a function that I can replace get() with?

Any other thoughts on how I could make this work?

theforestecologist
  • Models don't have p-values. Comparisons between models have p-values. You also need to clarify whether you are interested in a p-value for a model coefficient (which implicitly compares against a model without that coefficient) or a p-value for a model versus a null model. – IRTFM Nov 24 '15 at 00:15
  • And note that the `$` operator is equivalent to the use of `[[` ... not `[` . – IRTFM Nov 24 '15 at 00:20
  • @42: Oh, you're right. `summary(get(mod.name))[['coefficients']][1,4]` would make it work. Again, however, only in an isolated sense. I'm wondering if I can do the same thing without having to add the extra [#,#] to my code that all of the other attribute names do not require. – theforestecologist Nov 24 '15 at 01:14
  • This would be much cleaner code if you created a list of dataframes and worked on that object. The use of `get`, `assign`, `eval` and `parse` is a sign that you are trying to make R into a macro language like SAS or SPSS. The result is a fairly inefficient and turgid programming style. – IRTFM Nov 24 '15 at 02:27
  • @42- : could you point me toward a demonstration of what you mean? – theforestecologist Nov 24 '15 at 02:56
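
A minimal sketch of the list-based approach suggested in the comments above, assuming the models can be kept in a named list instead of separately named global objects. The names `dats`, `fits`, `sums`, and `f.pvals` are illustrative and do not appear in the question. The data are hypothetical: they have the same shape as `dat1` and `dat2` but with a little noise, because the question's columns are perfectly linear and `summary()` warns about an essentially perfect fit.

```r
# Hypothetical data: same layout as dat1/dat2 in the question, with noise added
dats <- list(
  dat1 = data.frame(col1 = c(1.2, 1.9, 3.1, 4.3, 4.8), col2 = 6:10),
  dat2 = data.frame(col1 = c(11.4, 11.8, 13.2, 13.9, 15.3), col2 = 16:20)
)

# Fit one model per data frame; the list names carry over to the results
fits <- lapply(dats, function(d) lm(col1 ~ col2, data = d))
sums <- lapply(fits, summary)

# `[[` (like `$`) returns the element itself; `[` returns a one-element list
r.vals     <- sapply(sums, function(s) s[["r.squared"]])
adj.r.vals <- sapply(sums, function(s) s[["adj.r.squared"]])

# p-value of the slope coefficient: row 2, column 4 of the coefficient matrix
p.vals <- sapply(sums, function(s) s[["coefficients"]][2, 4])

# p-value of the overall F-test (fitted model versus intercept-only model)
f.pvals <- sapply(sums, function(s) {
  f <- s[["fstatistic"]]
  pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
})
```

With a single predictor the slope p-value and the overall F-test p-value coincide; with several predictors they generally differ, which is the distinction raised in the first comment.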

1 Answer

One solution is to use eval() and parse():

p.val <- eval(parse(text=paste0("summary(",mod.name,")[",attr.name,"]")))

So the function would be:

mod.pvals <- NULL
p.func <- function(attr.name) {
  for(i in c('dat1','dat2')) {
    mod.name <- paste0('fit.',i)
    p.val <- eval(parse(text=paste0("summary(",mod.name,")[",attr.name,"]")))
    mod.pvals <- c(mod.pvals,p.val)
  }
  mod.pvals
}

r.vals <- p.func(attr.name = '\'r.squared\'')
p.vals <- p.func(attr.name = '[\'coefficients\']][2,4')
theforestecologist
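
An alternative sketch that avoids building code as strings: instead of an attribute name, pass `p.func` a function that pulls the wanted piece out of each model summary. This is not the approach used in the answer above, just one possible variation. The `extract` and `mod.names` arguments are hypothetical, the models are assumed to exist as `fit.dat1` and `fit.dat2` as in the question, and the data are made slightly noisy so that `summary()` does not warn about a perfect fit.

```r
# Hypothetical noisy data standing in for dat1/dat2 from the question
dat1 <- data.frame(col1 = c(1.2, 1.9, 3.1, 4.3, 4.8), col2 = 6:10)
dat2 <- data.frame(col1 = c(11.4, 11.8, 13.2, 13.9, 15.3), col2 = 16:20)
fit.dat1 <- lm(col1 ~ col2, data = dat1)
fit.dat2 <- lm(col1 ~ col2, data = dat2)

# extract:   function applied to each model summary; must return one number
# mod.names: names of the fitted model objects to look up with get()
p.func <- function(extract, mod.names = c("fit.dat1", "fit.dat2")) {
  vapply(mod.names, function(nm) extract(summary(get(nm))), numeric(1))
}

r.vals     <- p.func(function(s) s$r.squared)
adj.r.vals <- p.func(function(s) s$adj.r.squared)
p.vals     <- p.func(function(s) s$coefficients[2, 4])
```

Because the extractor is ordinary R code, any indexing such as `[2, 4]` lives in the caller, and no quoting or escaping of attribute names is needed.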