
I have some code that plots data from a data frame. When I run the code without the loop it works, but when I run it inside the loop and then print plots[[1]], plots[[2]], and so on, all the plots show the same graph; only the labels differ.

What am I doing wrong? Thanks.

#-- Plots
plots <- list()
combinations <- combn(dim(ssgsea.diff)[2],2)

for (i in 1:dim(combinations)[2]){

  fit <- lm(ssgsea.diff[,combinations[1,i]] ~ ssgsea.diff[,combinations[2,i]], 
            data = ssgsea.diff)

  plot1 <- ggplot(ssgsea.diff) + 
    aes(ssgsea.diff[,combinations[1,i]], ssgsea.diff[,combinations[2,i]]) + 
    geom_point()

  plot1 <- plot1 + 
    labs(x = names(ssgsea.diff)[combinations[1,i]], 
         y = names(ssgsea.diff)[combinations[2,i]]) + 
    geom_smooth(method="lm", col = "red") + 
    labs(title = paste("Adj R2 = ", signif(summary(fit)$adj.r.squared, 5), 
                       " Slope =",signif(fit$coef[[2]], 5), 
                       " Pval =",signif(summary(fit)$coef[2,4], 5)))

  plots[[i]] <- plot1
}
Z.Lin
HeyHoLetsGo
  • Can you provide the dataset used here in `dput()` form? Alternatively make this problem reproducible using one of the common datasets from `datasets` package. – Z.Lin Sep 06 '17 at 09:49
  • I cannot provide the dataset. But basically, there is something in the code that makes all the plots saved in the list `plots` look like the last plot created. This happens even though the code works outside of the loop. – HeyHoLetsGo Sep 06 '17 at 10:55

1 Answer


Since no data is provided, I'll attempt to demonstrate with a common dataset from R:

library(ggplot2) # needed for ggplot(), aes(), geom_point(), etc.

data(airquality)
ssgsea.diff <- as.data.frame(airquality)

plots <- list()
combinations <- combn(dim(ssgsea.diff)[2],2)

First, reproduce the problem using the original code. After the for loop completes there are 15 plots, and they do indeed all show the same graph with different labels:

for (i in 1:dim(combinations)[2]){

  fit <- lm(ssgsea.diff[,combinations[1,i]] ~ ssgsea.diff[,combinations[2,i]], 
            data = ssgsea.diff)

  plot1 <- ggplot(ssgsea.diff) + 
    aes(ssgsea.diff[,combinations[1,i]], ssgsea.diff[,combinations[2,i]]) + 
    geom_point()

  plot1 <- plot1 + 
    labs(x = names(ssgsea.diff)[combinations[1,i]], 
         y = names(ssgsea.diff)[combinations[2,i]]) + 
    geom_smooth(method="lm", col = "red") + 
    labs(title = paste("Adj R2 = ", signif(summary(fit)$adj.r.squared, 5), 
                       " Slope =",signif(fit$coef[[2]], 5), 
                       " Pval =",signif(summary(fit)$coef[2,4], 5)))

  plots[[i]] <- plot1
}

The solution below produces 15 different plots stored in the list (I also defined the column indices first, to make the code more readable):

for (i in 1:dim(combinations)[2]){

  # define column indices & column names first
  C1 <- combinations[1, i]; C1.name <- names(ssgsea.diff)[C1]
  C2 <- combinations[2, i]; C2.name <- names(ssgsea.diff)[C2]

  fit <- lm(ssgsea.diff[, C1] ~ ssgsea.diff[, C2]) # no need to specify data here

  plot1 <- ggplot(ssgsea.diff) + 
    aes_string(C1.name, C2.name) + geom_point()

  plot1 <- plot1 + 
    labs(x = C1.name, y = C2.name) + 
    geom_smooth(method="lm", col = "red") + 
    labs(title = paste("Adj R2 = ", signif(summary(fit)$adj.r.squared, 5), 
                       " Slope =", signif(fit$coef[[2]], 5), 
                       " Pval =", signif(summary(fit)$coef[2, 4], 5)))

  plots[[i]] <- plot1
}

rm(C1, C2, C1.name, C2.name, fit, plot1, i)
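As a variation on the loop above, here is a minimal sketch that wraps the plotting in a function and uses the `.data` pronoun instead of `aes_string`, which recent ggplot2 releases mark as deprecated. The helper name `make_pair_plot` is made up for illustration. One deliberate difference from the loop: the model here regresses the y column on the x column, so the slope in the title matches the line drawn by `geom_smooth`. That is my assumption about what is wanted, not something stated in the question.

```r
library(ggplot2)

data(airquality)
ssgsea.diff <- as.data.frame(airquality)
combinations <- combn(ncol(ssgsea.diff), 2)

# Hypothetical helper: builds one scatter plot for a pair of column names.
# Because x.name and y.name are function arguments, each plot keeps its own
# values and does not depend on a loop counter.
make_pair_plot <- function(df, x.name, y.name) {
  # Fit y ~ x so the reported slope matches the geom_smooth line
  fit <- lm(reformulate(x.name, response = y.name), data = df)
  s <- summary(fit)

  ggplot(df) +
    aes(.data[[x.name]], .data[[y.name]]) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, col = "red") +
    labs(x = x.name, y = y.name,
         title = paste("Adj R2 =", signif(s$adj.r.squared, 5),
                       " Slope =", signif(coef(fit)[[2]], 5),
                       " Pval =", signif(s$coefficients[2, 4], 5)))
}

# One plot per column pair, collected in a list
plots <- lapply(seq_len(ncol(combinations)), function(i) {
  make_pair_plot(ssgsea.diff,
                 names(ssgsea.diff)[combinations[1, i]],
                 names(ssgsea.diff)[combinations[2, i]])
})
```

Printing `plots[[1]]` and `plots[[2]]` should now show different point clouds. The airquality data contains missing values, so ggplot2 will warn about removed rows.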

Explanation: aes() does not evaluate its arguments when the plot object is created. The expressions ssgsea.diff[,combinations[1,i]] and ssgsea.diff[,combinations[2,i]] are stored unevaluated and only evaluated when a plot is printed, by which time i holds its current (i.e. last) value, so every plot draws the same pair of columns. The labels differ because labs() evaluates its arguments immediately, inside the loop. A safer method is to take advantage of the data argument in ggplot, which stores the data frame in the ggplot object itself, and to map the aesthetics to column names (as aes_string does above), so that the mapping no longer depends on i. Answers in these two posts discuss this further: Storing ggplot objects in a list from within loop in R & Storing plot objects in a list.
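The delayed evaluation can be seen in a small sketch outside any loop. The data frame `df` and its columns `a` and `b` are made up for illustration. The plot is created while `i` is 1, but its data is only computed afterwards, when `i` is already 2:

```r
library(ggplot2)

# Toy data, invented for this illustration
df <- data.frame(a = 1:3, b = c(10, 20, 30))

i <- 1
# aes() stores the expression df[, i]; it does not look up i yet
p <- ggplot(df) + aes(df[, i], df[, i]) + geom_point()

i <- 2  # change i after creating the plot, before it is built

# layer_data() builds the plot, so the mapping is evaluated here with i == 2
x.values <- layer_data(p, 1)$x
x.values  # taken from column b (10, 20, 30), not column a
```

This is the same thing that happens at the end of the for loop: every stored plot looks up `i` only when it is printed.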

Z.Lin