I would like to know how to refer to elements in a loop in R. In Stata, this is done with `var' inside a loop. I am working with loops and want to refer to the loop variables while regressing each dependent variable on a list of variables (x1 x2 x3). The x1 variable also has suffixes, so its name can be split into several shorter parts. The code I would write in Stata would be:

foreach credit in "short_term" "medium_term" "long_term" {
    foreach percentile in "p50" "p75" "p90" {
        foreach type in "high4" "high5" "high6" {
            reg y_-credit' x1_-percentile '_`type' x2 x3 
        }
    }
} 

In R, if I create a list and make a loop, how do I refer to each element in the list? For instance:

credit <- c("short_term","medium_term","long_term") 
percentile <- c("p50","p75","p90") 
type <- c("high4","high5","high6") 

for (c in credit) {
    for (p in percentile) {
        for (t in type) {
            baseline_[c]_[p]_[t] <- lm(y_[c] ~ xl_[p]_[t] + x2 + x3)
        } 
     }
}

I would then like to use sink to write all the results (summary(baseline) for every baseline model) to a single .txt file.

I hope my illustration was adequate in explaining my doubt. I am struggling with loops because of this (minor - when compared to STATA's `var') issue.

I await your response.

Thank you, Pranav

  • Please don't post code as an image. Also show what you tried in R and where you got stuck. – kath May 11 '18 at 19:03
  • Dear Kath, I am new to StackOverflow. I was unable to add the code properly and hence I uploaded an image. I shall try again to describe the code I tried on R (although the image shows what I wrote). Thank you, Pranav – Pranav Garg May 11 '18 at 22:58
  • Can you also share what kind of structures `baseline`, `xl`, and `y` have? Have a look at [how-to-make-a-great-r-reproducible-example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) if you don't know how to share your data. – kath May 12 '18 at 13:09
  • Dear Kath, I read the rules. I do not have access to the data (it is confidential). I apologize for the poorly framed question. Shall work on it. – Pranav Garg Jun 01 '18 at 10:53

1 Answer

One can use the formula() function to generate formulas from strings in R.

Since the OP isn't reproducible, we'll demonstrate formula() by using the mtcars data set:

data(mtcars) # Use motor trend cars data set
dvs <- c("mpg","qsec")
ivs <- c("am","wt","disp")
for(d in dvs){
     for(i in ivs){
          message(paste("d is: ", d, "i is: ",i))
          print(summary(lm(formula(paste(d,"~",i)),mtcars)))
     }
}

...and the first part of the output:

> for(d in dvs){
+      for(i in ivs){
+           message(paste("d is: ", d, "i is: ",i))
+           print(summary(lm(formula(paste(d,"~",i)),mtcars)))
+      }
+ }
d is:  mpg i is:  am

Call:
lm(formula = formula(paste(d, "~", i)), data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max 
-9.3923 -3.0923 -0.2974  3.2439  9.5077 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   17.147      1.125  15.247 1.13e-15 ***
am             7.245      1.764   4.106 0.000285 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.902 on 30 degrees of freedom
Multiple R-squared:  0.3598,    Adjusted R-squared:  0.3385 
F-statistic: 16.86 on 1 and 30 DF,  p-value: 0.000285

Since the output from lm() can be saved in an object, one can also generate a list() of model objects, and manipulate them further in R.
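
As a minimal sketch of that idea, the loop below stores each fitted model in a named list instead of printing it. The list name `models` and the `dv_iv` naming scheme are assumptions made for illustration; they do not come from the OP.

```r
data(mtcars)
dvs <- c("mpg", "qsec")
ivs <- c("am", "wt", "disp")

models <- list()  # hypothetical container for the fitted models
for (d in dvs) {
  for (i in ivs) {
    # build the formula from strings, fit the model, and store it by name
    f <- formula(paste(d, "~", i))
    models[[paste(d, i, sep = "_")]] <- lm(f, data = mtcars)
  }
}

names(models)              # one entry per dependent/independent pair
coef(models[["mpg_am"]])   # extract coefficients from a single stored model
```

Because every element is an ordinary lm object, functions such as summary() or coef() can be applied to the whole list with lapply().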

To generate named variables for the formula() statement from vectors containing elements of the desired variable names, one can use the paste() or paste0() functions in a manner similar to the approach taken above with the mtcars data set. paste0() inserts no separator between its arguments, whereas paste() inserts a single space by default (this can be changed with the sep argument).
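
A quick illustration of the difference, using one of the OP's suffixes (the object names `with_space`, `no_sep`, and `custom_sep` are only for this example):

```r
with_space <- paste("y", "short_term")             # default sep = " "
no_sep     <- paste0("y_", "short_term")           # no separator at all
custom_sep <- paste("y", "short_term", sep = "_")  # explicit separator

print(with_space)  # "y short_term"
print(no_sep)      # "y_short_term"
print(custom_sep)  # "y_short_term"
```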

Again, making some guesses as to the actual intended formulae, we'll use the OP's nested for() loops to generate strings that can be used with formula() in an lm() function.

# 
# generate formulas using content from OP
# 
credit <- c("short_term","medium_term","long_term") 
percentile <- c("p50","p75","p90") 
type <- c("high4","high5","high6") 

for (c in credit) {
     for (p in percentile) {
          for (t in type) {
               aFormula <- paste0("y_",c," ~ x1_",p,"_",t," + x2 + x3")
               print(aFormula)
          } 
     }
}

...and the start of the output:

> credit <- c("short_term","medium_term","long_term") 
> percentile <- c("p50","p75","p90") 
> type <- c("high4","high5","high6") 
> 
> for (c in credit) {
+      for (p in percentile) {
+           for (t in type) {
+                aFormula <- paste0("y_",c," ~ x1_",p,"_",t," + x2 + x3")
+                print(aFormula)
+           } 
+      }
+ }
[1] "y_short_term ~ x1_p50_high4 + x2 + x3"
[1] "y_short_term ~ x1_p50_high5 + x2 + x3"
[1] "y_short_term ~ x1_p50_high6 + x2 + x3"
[1] "y_short_term ~ x1_p75_high4 + x2 + x3"
[1] "y_short_term ~ x1_p75_high5 + x2 + x3"
[1] "y_short_term ~ x1_p75_high6 + x2 + x3"
[1] "y_short_term ~ x1_p90_high4 + x2 + x3"
[1] "y_short_term ~ x1_p90_high5 + x2 + x3"
[1] "y_short_term ~ x1_p90_high6 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high4 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high5 + x2 + x3"
[1] "y_medium_term ~ x1_p50_high6 + x2 + x3"
. 
. 
. 

Note that the content in the OP inconsistently uses - vs. _, so I used _ at all relevant spots in the formulae.
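
Putting the pieces together, here is a hedged sketch of the full workflow the OP describes: fit every combination, keep the models in a named list, and write all summaries to one text file with sink(). Since the real data were not shared, the data frame `dat` is simulated, and the names `dat`, `baseline`, and `baseline_results.txt` are assumptions for illustration only. The loop variables are called cr, p, and ty here simply to avoid reusing the names of the base functions c() and t().

```r
credit     <- c("short_term", "medium_term", "long_term")
percentile <- c("p50", "p75", "p90")
type       <- c("high4", "high5", "high6")

# Simulated stand-in data (assumption: column names follow the OP's pattern)
set.seed(1)
n <- 50
y_names   <- paste0("y_", credit)
x1_names  <- as.vector(outer(percentile, type,
                             function(p, t) paste0("x1_", p, "_", t)))
all_names <- c(y_names, x1_names, "x2", "x3")
dat <- as.data.frame(matrix(rnorm(n * length(all_names)), nrow = n,
                            dimnames = list(NULL, all_names)))

# Fit one model per combination and store it under a descriptive name
baseline <- list()
for (cr in credit) {
  for (p in percentile) {
    for (ty in type) {
      f <- formula(paste0("y_", cr, " ~ x1_", p, "_", ty, " + x2 + x3"))
      baseline[[paste(cr, p, ty, sep = "_")]] <- lm(f, data = dat)
    }
  }
}

# Divert printed output to a text file (hypothetical file name)
out_file <- file.path(tempdir(), "baseline_results.txt")
sink(out_file)
for (nm in names(baseline)) {
  cat("\n=====", nm, "=====\n")
  print(summary(baseline[[nm]]))
}
sink()  # restore output to the console
```

With real data, replace the simulated `dat` with the actual data frame and point `out_file` at the desired location.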

Len Greski