Apologies if this has been answered before but I could not find the answer anywhere. I have a regression with 5 different outcome variables and 10 different explanatory variables, so I use two loops to run the model as follows:

for(i in 1:length(outcome)){
    for(j in 1:length(explanatory)){
      reg[[i]] <- glm(as.formula(paste(outcome[i],"~",explanatory[j])), data=mydata, family=binomial)
      assign(paste0("reg", i, j), reg[[i]])
  }
}

This way, for example, reg11 is the regression with the first outcome and the first explanatory variable, and reg310 is the regression with the third outcome and the 10th explanatory variable.

Now I want to extract the betas from each regression to create new dataframes, and I use the following:

for(i in 1:5){
  for(j in 1:10){
  betas <- reg[[i,j]]$"coefficients"
  }
}

However, it seems that the syntax [[i,j]] is wrong. I have tried [[i,j]], [[i]][[j]], and numerous other combinations, but none of them seems to work. How should I write it so that R understands which regression I am referring to?

Thank you very much!

maxjohnson
  • How did you define `reg` – akrun May 17 '21 at 18:59
  • What exactly is `base.reg3`? That doesn't seem to be created in your first loop. Is it a data.frame? It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Also it's really not a good idea to use `assign` to create variables in your global environment. Things are much easier if you put your results in a proper list. – MrFlick May 17 '21 at 18:59
  • Apologies it should only say "reg" not "base.reg3", I amended the question. I am trying to switch to using lists as it seems everyone agrees it is much easier. But I thought in this case it was just a matter of syntax as to how to refer to the regression with i and j? – maxjohnson May 17 '21 at 19:01

2 Answers

I think the issue here is that you are assigning the fitted model to reg[[i]] within the j loop. This means that, for a given i, every iteration of the j loop overwrites reg[[i]], so only the model for the last j survives. You will need a list within a list to achieve your desired outcome. Below is an example of what I think you want:

##Creating some dataframes for fake data: 10 observations of 5 outcomes and 10 explanatory variables##
outcome<-as.data.frame(matrix(data=NA, nrow=10, ncol =5))
explanatory<-as.data.frame(matrix(data=NA, nrow=10, ncol =10))

##Functions to produce fake data (X is ignored; each call returns 10 values)##
obs.vars<-function(X){runif(10, 5, 100)}
resp.vars<-function(X){sample(c(0,1), 10, replace=TRUE)}

##Populating dataframes with fake data (apply returns a matrix with one column per variable)##
outcome<-apply(outcome, 2, resp.vars)
explanatory<-apply(explanatory, 2, obs.vars)

##Create an empty list to save results of i loop##
reg<-list()
##Looping to perform regressions
for(i in 1:ncol(outcome)){# i loop for response variables
  tmp<-list()#empty list to save results of j loop
  for(j in 1:ncol(explanatory)){# j loop for explanatory variables
    tmp[[j]]<- glm(outcome[,i]~explanatory[,j], family=binomial)# save regression model to jth element of tmp
    assign(paste0("reg", i, j), tmp[[j]])# optional: only needed if you also want reg11, reg12, ... as separate objects
  }#close j loop
  reg[[i]]<-tmp #save tmp list to ith element of reg
}#close i loop

##Similar set up to extract coefficients from models##
##list to save results of i loop
betas<-list()

#Looping to extract coefficients
for(i in seq_along(reg)){#i loop for response variables
  tmp<-list()#empty list to save values from j loop
  for(j in seq_along(reg[[i]])){#j loop for explanatory variables
    co<-reg[[i]][[j]]$coefficients#named vector: intercept and slope
    tmp[[j]]<- data.frame(Obs_Var = i, Exp_Var = j, Term = names(co), Coefficient = unname(co))
  }#close j loop
  betas[[i]]<-do.call(rbind, tmp)#convert tmp to dataframe and save it to ith element of betas
}#close i loop
betas<-do.call(rbind, betas)# convert betas to dataframe
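
On the syntax question itself: with a list of lists, the model for outcome i and explanatory variable j is reached by chaining two [[ ]] calls, reg[[i]][[j]]. A plain list has no second dimension, so reg[[i,j]] fails. The sketch below is self-contained and uses the built-in mtcars data, with am and vs as stand-in outcomes and mpg, wt and hp as stand-in explanatory variables; those choices are assumptions for illustration only, not the asker's data.

```r
# Stand-in variable names from the built-in mtcars data (illustrative assumption)
outcome <- c("am", "vs")
explanatory <- c("mpg", "wt", "hp")

# Outer list over outcomes, inner list over explanatory variables
reg <- lapply(outcome, function(y) {
  lapply(explanatory, function(x) {
    glm(reformulate(x, response = y), data = mtcars, family = binomial)
  })
})

# Model for the 2nd outcome and the 3rd explanatory variable
m <- reg[[2]][[3]]
formula(m)   # the formula this model was fitted with
coef(m)      # named vector: intercept and slope
```

The same two-step indexing works inside a loop, e.g. reg[[i]][[j]]$coefficients.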

Take Care, -Sean

Sean McKenzie

By assigning all regressions with different values for j to the same list element (reg[[i]]) you constantly overwrite the old regression models. You only keep the last one for each value of i.
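
To see the overwriting concretely, here is a minimal sketch of the original loop. It uses the built-in mtcars data with am and vs as stand-in outcomes and mpg, wt and hp as stand-in explanatory variables; these are assumptions for illustration, not the data from the question.

```r
# Stand-in variable names from the built-in mtcars data (illustrative assumption)
outcome <- c("am", "vs")
explanatory <- c("mpg", "wt", "hp")

reg <- list()
for (i in seq_along(outcome)) {
  for (j in seq_along(explanatory)) {
    f <- as.formula(paste(outcome[i], "~", explanatory[j]))
    # Same slot for every j: each fit replaces the previous one
    reg[[i]] <- glm(f, data = mtcars, family = binomial)
  }
}

length(reg)             # one model per outcome, not one per (i, j) pair
names(coef(reg[[1]]))   # the slope belongs to the last explanatory variable
```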

I would use your logic of pasting i and j to a label for the regression to store them in an orderly fashion:

reg = list()
for(i in 1:length(outcome)){
    for(j in 1:length(explanatory)){
      label = paste("reg",i,"-",j,sep="")
      reg[[label]] <- glm(as.formula(paste(outcome[i],"~",explanatory[j])), data=mydata, family=binomial)
  }
}

So, you get a list with items that are called "reg1-1" or "reg3-10". Now, you can loop through this list.

Note: if you access the coefficients element with the $ notation, you don't need quotes (reg[[model]]$coefficients works). To collect the betas, loop over the names of the list:
betas = c()
models = c()
ivals = c()
jvals = c()
for(model in names(reg)){
    coefs = reg[[model]][["coefficients"]]
    beta = unname(coefs[2]) ## Or wherever your beta of interest is located in the coefficients (coefs[1] is the intercept)
    ij = as.integer(strsplit(substr(model,4,nchar(model)),"-")[[1]]) ## Get i and j from the name, e.g. "reg3-10" gives 3 and 10
    betas=c(betas,beta)
    models=c(models,model)
    ivals=c(ivals,ij[1])
    jvals=c(jvals,ij[2])
}

## One row per model: its label, its beta, and the indices i and j
dta = data.frame("Model"=models,
                 "Beta"=betas,
                 "I"=ivals,
                 "J"=jvals)

So, you get a neat little data frame with all your betas and their corresponding values for i and j.

You could also do the complete second loop within the first loop where you already know i and j and don't have to pry them from the name of the model. But I tried to keep as close to your solution as possible.
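
A minimal sketch of that single-loop variant, assuming the built-in mtcars data as a stand-in (am and vs as outcomes, mpg, wt and hp as explanatory variables; none of these names come from the question). Each model contributes one row while i and j are still in scope:

```r
# Stand-in variable names from the built-in mtcars data (illustrative assumption)
outcome <- c("am", "vs")
explanatory <- c("mpg", "wt", "hp")

rows <- list()   # one single-row data frame per fitted model
for (i in seq_along(outcome)) {
  for (j in seq_along(explanatory)) {
    f <- as.formula(paste(outcome[i], "~", explanatory[j]))
    fit <- glm(f, data = mtcars, family = binomial)
    rows[[length(rows) + 1]] <- data.frame(
      Model = paste0("reg", i, "-", j),
      Beta  = unname(coef(fit)[2]),  # slope; coef(fit)[1] is the intercept
      I     = i,
      J     = j
    )
  }
}
dta <- do.call(rbind, rows)   # stack the rows into one data frame
```

If the fitted models are needed later as well, fit can additionally be stored in a list inside the same loop.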

Martin Wettstein