I am trying to create before/after comparison plots for several columns of my dataset in a for loop, and eventually save all comparison plots to one PDF file. For each column I generate the "before" plot, manipulate the data, generate the "after" plot, and then place the two side by side with ggarrange (I also tried grid.arrange from gridExtra, which did not solve the issue). What I get, however, are two identical plots that both show the data AFTER manipulation (though the titles are different).
Here is a reproducible example:
library(rlist)
library(ggplot2)
library(ggpubr)
head(iris)
plot_before <- list()
plot_after <- list()
plots <- list()
for (i in 1:4) {
  p <- ggplot(iris, aes(iris[, i])) + geom_histogram() + ggtitle(paste0(i, "_pre"))
  print(p)
  plot_before <- list.append(plot_before, p)
  # do something with your data
  iris[, i] <- 3 * iris[, i]
  p2 <- ggplot(iris, aes(iris[, i])) + geom_histogram() + ggtitle(paste0(i, "_post"))
  print(p2)
  plot_after <- list.append(plot_after, p2)
  q <- ggarrange(p, p2) # here, p is already linked to modified data
  print(q)
  plots <- list.append(plots, q)
}
# try to access plots from lists
for (i in 1:4) {
  print(plot_before[[i]])
  print(plot_after[[i]])
  print(plots[[i]])
}
I suppose this has something to do with ggplot creating "only" a graphics object that is linked to the data, so whenever I print it again, it accesses the data again and fetches the manipulated data instead of a previous "snapshot". Saving the plots to separate lists does not help either; they are "linked" to the manipulated data as well.
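To narrow this down, here is a minimal sketch of what I think is happening. It assumes the aesthetic expression is only evaluated when the plot is built (for example on print). The names `df`, `p_env` and `p_col` are made up for illustration, and `layer_data()` is used only to inspect the values a plot would draw. If the assumption holds, an expression such as `df[, 1]` is looked up again in the surrounding environment every time the plot is built, whereas a bare column name is looked up in the data stored in the plot object.

```r
library(ggplot2)

# Hypothetical toy data, not part of the original example
df <- data.frame(x = c(1, 2, 3))

# Aesthetic refers to the data frame by name, as in the loop above
p_env <- ggplot(df, aes(x = df[, 1], y = 0)) + geom_point()
# Aesthetic refers to a column of the data stored in the plot
p_col <- ggplot(df, aes(x = x, y = 0)) + geom_point()

env_before <- layer_data(p_env)$x  # values the plot would draw right now

df[, 1] <- 3 * df[, 1]             # manipulate the data afterwards

env_after <- layer_data(p_env)$x   # same plot object, built again
col_after <- layer_data(p_col)$x   # plot created before the change

print(env_before)
print(env_after)
print(col_after)
```

If `env_after` shows the tripled values while `col_after` still shows the original ones, then the plot object does keep the data it was created with, and the "link" comes from the `iris[, i]` expression inside `aes()`.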
Is there a way to make a persistent ggplot object rather than having it linked to the data?
One could, of course, create new columns with the modified data and refer to those, or create a completely new data frame, but I would like to avoid data duplication.
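For reference, here is a sketch of a loop that might avoid the problem without adding columns by hand. It assumes that `aes()` supports the `.data` pronoun and `!!` injection (ggplot2 3.0.0 or later). `df` is a working copy I introduce so the built-in `iris` is not overwritten, and `bins = 30` is only there to silence the binwidth message. The idea is that `!!col` fixes the column name when the plot is created, and `.data` refers to the data stored in each plot object rather than to whatever `df` contains later.

```r
library(ggplot2)
library(ggpubr)

df <- iris  # working copy (illustrative name)
plot_before <- list()
plot_after <- list()
plots <- list()

for (i in 1:4) {
  col <- names(df)[i]

  # !!col injects the current column name at creation time;
  # .data points at the data stored inside this plot object
  p <- ggplot(df, aes(x = .data[[!!col]])) +
    geom_histogram(bins = 30) +
    ggtitle(paste0(i, "_pre"))
  plot_before[[i]] <- p

  # do something with the data
  df[[col]] <- 3 * df[[col]]

  p2 <- ggplot(df, aes(x = .data[[!!col]])) +
    geom_histogram(bins = 30) +
    ggtitle(paste0(i, "_post"))
  plot_after[[i]] <- p2

  plots[[i]] <- ggarrange(p, p2)
}
```

Printing each element of `plots` between `pdf("comparison.pdf")` and `dev.off()` should then give one comparison per page. Note that this does not remove copies entirely: each plot object still carries the data frame it was created with.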