I am trying to loop over my multiple linear regression plots and summaries, but R keeps raising `Error: More than one expression parsed`. I am not sure how to fix this, or whether there is a better way to achieve what I want, which is mainly to:

  1. Plot a multiple linear regression plot with Group as the colour
  2. Get summary for each of the linear regression lines based on Group
  3. Compute regression summary
  4. Perform anova to determine differences

colNames <- names(df)[c(35:39)]
for(i in colNames){
  plt <- ggplot(df,
                aes_string(x=df$MachineLength, y=i, fill=df$Group, color=be_nlyl$Group)) +
    geom_smooth(method=lm) +
    geom_point(size = 2, alpha=0.7) +
    labs(title="Machine", subtitle = "Machine Type") +
    theme_bw() +
    theme(plot.title = element_text(hjust=0.5, face="bold"),
          plot.subtitle = element_text(hjust=0.5))
  print(plt)
  lm_A <- lm(formula = i ~ MachineLength, data = subset(be_nlyl, Group == "A"))
  summary(lm_A) %>% print()
  lm_B <- lm(formula = i ~ MachineLength, data = subset(be_nlyl, Group == "B"))
  summary(lm_B) %>% print()
  clz.lm <- lm(formula = i ~ Group + MachineLength + Group:MachineLength, data = df)
  summary(clz.lm) %>% print()
  ano.lm <- Anova(lm(i ~ MachineLength*Group, data = df))
  print(ano.lm)
}

Does anyone have ideas on how to implement the above? Thank you!

Ronak Shah
smicaela
  • `j <- as.symbol(i); lm_A <- eval(bquote(lm(formula = .(j) ~ MachineLength, data = subset(df, Group == "A"))))` – Roland Sep 10 '20 at 09:15
  • And your ggplot2 `aes` code is probably not correct either. Remove the `df$` from it and quote all variables except `i`. – Roland Sep 10 '20 at 09:17

1 Answer

Try the following:

  1. Create lists with the same length as colNames to hold all the outputs, so that the results are stored rather than just printed.

  2. Loop over the indices of colNames instead of the column names themselves, so that the index can be used to store the output in each list.

  3. aes_string has been deprecated, so we use the .data pronoun to pass the column name as a variable.

  4. Use sprintf to build the formula string that is passed to lm.

library(ggplot2)
library(car)  # provides Anova()

colNames <- names(df)[c(35:39)]
plt <- vector('list', length(colNames))
lm_A <- vector('list', length(colNames))
summary_lm_A <- vector('list', length(colNames))
summary_lm_B <- vector('list', length(colNames))
lm_B <- vector('list', length(colNames))
clz.lm <- vector('list', length(colNames))
summary_clz.lm <- vector('list', length(colNames))
ano.lm <- vector('list', length(colNames))

for(i in seq_along(colNames)) {
  var <- colNames[i]
  plt[[i]] <- ggplot(df, aes(MachineLength, .data[[var]], fill= Group, color= Group)) + 
               geom_smooth(method=lm) + 
               geom_point(size = 2, alpha=0.7) + 
               labs(title="Machine", subtitle = "Machine Type") + 
               theme_bw() + 
               theme(plot.title = element_text(hjust=0.5, face="bold"), 
                     plot.subtitle = element_text(hjust=0.5))
  lm_A[[i]] <- lm(sprintf('%s~MachineLength', var), data = subset(df, Group == "A"))
  summary_lm_A[[i]] <- summary(lm_A[[i]])
  lm_B[[i]] <- lm(sprintf('%s~MachineLength', var), data = subset(df, Group == "B"))
  summary_lm_B[[i]] <- summary(lm_B[[i]])
  clz.lm[[i]] <- lm(sprintf('%s~Group + MachineLength + Group:MachineLength', var), data = df)
  summary_clz.lm[[i]] <- summary(clz.lm[[i]])
  ano.lm[[i]] <- Anova(lm(sprintf('%s~MachineLength*Group', var), data = df))
}
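
Because the results are kept in lists, they can be looked up after the loop finishes. The following is only a minimal sketch of that idea using base R: the built-in `mtcars` data stands in for `df`, `wt` for `MachineLength`, and `factor(am)` for `Group`; these substitutions and the name `fits` are assumptions for illustration, not part of the original data.

```r
# Hypothetical stand-in data: mtcars plays the role of df,
# wt the role of MachineLength, factor(am) the role of Group.
colNames <- c("mpg", "hp")
fits <- vector("list", length(colNames))

for (i in seq_along(colNames)) {
  # Same pattern as above: build the formula string with sprintf()
  fits[[i]] <- lm(sprintf("%s ~ wt * factor(am)", colNames[i]), data = mtcars)
}

# Naming the list lets you retrieve a model by column name
names(fits) <- colNames

# Pull one number out of every stored summary
r2 <- sapply(fits, function(m) summary(m)$r.squared)
print(r2)

# Inspect the coefficient table of a single stored model
print(coef(summary(fits[["mpg"]])))
```

The same `names(x) <- colNames` step should work for the `plt`, `summary_lm_A`, and `ano.lm` lists, so that `print(plt[[colNames[1]]])` shows the first plot.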
Ronak Shah
  • Thank you!! I just needed to update the `printf()` to `sprintf()` in 3rd last line to remove the error I got at first and the code ran perfect after! If there was a way to have the column name shown instead of `%s~` in the printed output, that would be great, right now it shows `Call: lm(formula = sprintf("%s~MachineLength", var), data = subset(df, Group == "Case"))` but not sure if can show `Call: lm(formula = sprintf("TypeA~MachineLength", var), data = subset(df, Group == "Case"))` instead? – smicaela Sep 10 '20 at 13:35
  • Thanks. I corrected the `sprintf` typo. Changing that call isn't straightforward. You have to use `eval` + `call` there. See this https://stackoverflow.com/questions/38558523/showing-string-in-formula-and-not-as-variable-in-lm-fit – Ronak Shah Sep 10 '20 at 15:06
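
One way to get the real column name into the printed `Call:` is the `eval(bquote(...))` pattern from Roland's comment on the question. The sketch below is a hedged illustration of that pattern, not a drop-in replacement: `mtcars`, `wt`, and the names `vars` and `fits` are stand-ins assumed for the example.

```r
# Hypothetical stand-in data: mtcars replaces df, wt replaces MachineLength.
vars <- c("mpg", "hp")

fits <- lapply(vars, function(v) {
  j <- as.symbol(v)  # turn the column name string into a symbol
  # bquote() substitutes .(j) before the call is evaluated, so lm()
  # records the actual column name in its stored call
  eval(bquote(lm(.(j) ~ wt, data = mtcars)))
})
names(fits) <- vars

# The stored call now contains the substituted formula
print(fits$mpg$call)
print(summary(fits$hp)$call)
```

Since `summary()` reuses the model's stored call, the printed summaries should then show the substituted formula rather than the `sprintf()` expression.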