
I have a data set where I want to plot columns against each other in a facet_wrap or grid.arrange fashion. My data set is called SardF.plot:

str(SardF.plot)
'data.frame':   42 obs. of  22 variables:
$ Sample  : chr  "Kx V17" "Mu V17" "Ob V17" "Vä V17" ...
$ Mill    : chr  "Kx" "Mu" "Ob" "Vä" ...
$ Halfyear: chr  "V17" "V17" "V17" "V17" ...
$ pH      : num  12.4 11.56 9.84 12.84 10.68 ...
$ Na      : num  59199 22604 9095 30052 18014 ...
$ K       : num  5547 1994 1345 8238 1276 ...
$ Ca      : num  23.6 17.6 68.4 22.2 17.9 ...
$ Cr      : chr  "0.90659967379114681" "1.6064235870083998E-2" "      
$ Ni      : num  0.0314 0.036 0.1208 0.0396 0.041 ...
$ Cu      : num  0.0786 0.4648 0.0656 0.4747 0.2705 ...
$ Zn      : num  0.244 0.384 0.269 0.748 0.205 ...
$ Cd      : num  0.00375 0.00339 0.0035 0.00216 0.00361 ...
$ Pb      : num  0.000654 0.00148 0.000644 0.008429 0.000576 ...
$ Na Fast : num  70848 53117 22256 84498 27894 ...
$ K Fast  : num  6392 4732 3238 9547 2158 ...
$ Ca Fast : num  175958 140652 150944 240352 141438 ...
$ Cr Fast : num  150 102 124 83 256 65 17 17 41 418 ...
$ Ni Fast : num  54.8 73.5 210.5 38.8 170.7 ...
$ Cu Fast : num  155 614 589 208 453 ...
$ Zn Fast : num  1493 5909 5145 2074 3582 ...
$ Cd Fast : num  6.02 14.25 12.67 7.36 14.47 ...
$ Pb Fast : num  27.2 47.2 11.1 23.4 16.5 9.6 3.1 8.2 12.5 30 ...

I want to plot column 5 against column 14, column 6 against column 15, etc. (each element against its "Fast" counterpart). I have successfully used facet_wrap when all the plots have the same x (Element is a vector containing Na-Pb):

ggplot(gather(Sardinia.plot, key=Element, value="value", -"pH", -"Sample", -"Mill", -"Halfyear"),
  aes(x=pH, y=as.numeric(value), colour=Mill, group=Sample) )  + 
  geom_point(aes(shape=Halfyear)) + 
  facet_wrap(~ Element, scales = 'free') +
  ggtitle("pH VS leached amount at LS 10") +
  ylab("mg leached/kg GLD")+
  xlab("pH")+
  theme(plot.title = element_text(hjust = 0.5), 
       legend.title = element_blank())

and I can successfully generate the plots I want as individual plots and save them by using a for loop:

for (j in 0:1)
{  
setwd("\\\\orunet\\dfs\\home07\\nse\\my documents\\LS lakningar\\R bilder\\mg ut per kg mot fast Sardinia") 
for (i in 5:13)
{
    Fast <- i+9
    myplot<-ggplot(SardF.plot) +
    geom_point( aes(x=SardF.plot[[Fast]], y=as.numeric(SardF.plot[[i]]), colour=Mill, shape=Halfyear), size=3 ) + 
    ggtitle(colnames(SardF.plot[i])) +
    xlab("mg/kg in solid GLD") +
    ylab("mg leached/kg GLD") +
    theme(plot.title = element_text(hjust = 0.5), 
        legend.title = element_blank()
    ) 
    ID <-colnames(SardF.plot[i])
    ggsave(myplot, filename=paste(ID,".jpeg",sep=""), width = 16, height = 15, units = "cm")
    }
    setwd("\\\\orunet\\dfs\\home07\\nse\\my documents\\LS lakningar") 
 }

But I cannot figure out how to combine the above plots into a single figure. I cannot use grid.arrange, since the plots generated by the loop cannot be referenced individually: they are all stored in myplot, and I have not figured out how to store them under individual names (and I really, really do not want to copy and paste p1...n <- instead of using a loop, as I can have up to 40 graphs). Using the above facet_wrap and specifying x and y as SardF["x-columns"] and SardF["y-columns"] did not work either. I am stuck; can anyone help me?

Nanna
  • 1
    Make a list of plots. `plot_list = list()` before the loop and `plot_list[[i]] <- myplot` in the loop. (Generally that works, for loops that start at 1. You might need to create a special counter for your case. Or name them instead of using indices, `plot_list[[id]] <- myplot`) – Gregor Thomas May 23 '19 at 15:28
  • @Nanna: doing analysis by column number is not recommended as the order can change thus produce undesirable results. It's better to use column name matching instead. See these examples: https://stackoverflow.com/a/55524126/786542 & https://stackoverflow.com/a/50522928/786542 – Tung May 23 '19 at 16:14
  • @Gregor Using plot_list in the loop does not work; it only saves the last plot. The title changes but the actual content is the same. So if I use plot_list$Na, what I get is the plot for Pb (last in the loop) but with the ggtitle "Na". – Nanna May 24 '19 at 16:59
  • @Tung I would like to match on column names, but unfortunately the examples are just too advanced for me. I could not even begin to understand the code in the second example, and I don't think the first one is applicable: I want the colors and shapes to be the same for all plots, with only x and y changing, and the code in the example is too advanced for me to "reverse engineer" for my situation without actually seeing the plots. I am sorry; you are right that column name matching is better, but I am just not good enough to get it to work. – Nanna May 24 '19 at 17:05
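
A note on the "only the last plot is saved" symptom described above: in the loop from the question, `aes(x=SardF.plot[[Fast]], y=as.numeric(SardF.plot[[i]]))` is not evaluated when the plot is created but when it is drawn, so every stored plot looks up `i` and `Fast` at their final loop values, while `ggtitle()` is evaluated immediately and keeps the right title. A minimal sketch of one way around this, on made-up data: build each plot inside a function, force its arguments, and refer to columns by name through the `.data` pronoun. The data frame `toy` and the helper `pair_plot` are hypothetical names for illustration, not part of the question.

```r
library(ggplot2)

# Toy data standing in for SardF.plot (assumption: names follow the "Na" / "Na Fast" pattern)
set.seed(1)
toy <- data.frame(
  Mill = rep(c("Kx", "Mu"), each = 5),
  Na = runif(10), K = runif(10),
  `Na Fast` = runif(10), `K Fast` = runif(10),
  check.names = FALSE
)

# Hypothetical helper: scatter plot of a leached column against its solid ("Fast") column
pair_plot <- function(data, y_name, x_name) {
  # Fix the column names now, not later when the plot is drawn
  force(y_name)
  force(x_name)
  ggplot(data, aes(x = .data[[x_name]], y = .data[[y_name]], colour = Mill)) +
    geom_point(size = 3) +
    ggtitle(y_name)
}

# Store each plot under the element's name instead of overwriting one variable
elements <- c("Na", "K")
plot_list <- list()
for (el in elements) {
  plot_list[[el]] <- pair_plot(toy, el, paste(el, "Fast"))
}
```

With this structure, `plot_list$Na` and `plot_list$K` should each keep their own data, and the whole list can be handed to a function that accepts a list of plots.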

1 Answer


Here's a stripped down example. First, I create a dummy data frame (as you didn't provide any copy-and-pastable data). In this example, I'll plot column 1 against column 9, 2 against 10, 3 against 11, etc., but you can easily adapt this for your particular case.

# Dummy data frame
df <- data.frame(A = runif(10),
                 B = runif(10),
                 C = runif(10),
                 D = runif(10),
                 E = runif(10),
                 F = runif(10),
                 G = runif(10),
                 H = runif(10),
                 I = runif(10),
                 J = runif(10),
                 K = runif(10),
                 L = runif(10),
                 M = runif(10),
                 N = runif(10),
                 O = runif(10),
                 P = runif(10))

Next, I load the libraries.

# Load libraries
library(dplyr)
library(ggplot2)

Here, I define a function that will create and return a plot. It accepts an argument i that is the position of the first column and it figures out what the second column should be based on an offset. In my example, that offset is 8.

# Plotting function
myplot <- function(i, offset = 8){
  # Subset data frame to the two columns being compared
  df_plot <- df %>% select(i, i + offset)

  # Plot one column against another
  g <- ggplot(df_plot) + geom_point(aes_string(x = names(df_plot)[1], y = names(df_plot)[2]))

  # Return the plot explicitly so that lapply() collects it
  g
}
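
One caveat when adapting this to real data: `aes_string()` parses each name as R code, so a column name that contains a space (such as `Na Fast` in the question's data) fails with an "unexpected symbol" parse error. Below is a sketch of a variant that looks columns up by name with the `.data` pronoun instead. `df_spaces` and `myplot_by_name` are hypothetical names used only for this illustration, and the offset of 2 simply matches the small example data.

```r
library(ggplot2)

# Small data frame whose last two columns have names containing spaces
set.seed(42)
df_spaces <- data.frame(
  Na = runif(10), K = runif(10),
  `Na Fast` = runif(10), `K Fast` = runif(10),
  check.names = FALSE
)

# Variant of the plotting function that treats column names as strings, not code
myplot_by_name <- function(i, data = df_spaces, offset = 2) {
  x_name <- names(data)[i]
  y_name <- names(data)[i + offset]
  ggplot(data) +
    geom_point(aes(x = .data[[x_name]], y = .data[[y_name]])) +
    labs(x = x_name, y = y_name)
}

# One plot per pair: column 1 vs 3, column 2 vs 4
plist_spaces <- lapply(1:2, myplot_by_name)
```

Renaming the columns to syntactic names (for example `Na_Fast`) would be another way to avoid the parse error while keeping `aes_string()`.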

Then, I apply the function using lapply to create a list of plots. The first argument to lapply is the indices of the variable to be plotted on the x-axis. In my example, that's 1-8.

# Create list of plots
plist <- lapply(1:(ncol(df)/2), myplot)

This gives me a list that I can pass to cowplot's plot_grid function, producing the following.

# Plot all together with cowplot
cowplot::plot_grid(plotlist = plist, align = "hv")
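
As an alternative to assembling separate plots, the pairs can also be stacked into one long data frame and drawn with `facet_wrap`, which is closer to what the question first attempted. This is only a sketch under assumptions: the data frame `wide`, the element vector, and the `"<element> Fast"` naming pattern are made up here to mirror the question's column names, and base R is used for the reshaping to keep the example self-contained.

```r
library(ggplot2)

# Toy wide data: each element has a leached column and a matching "Fast" (solid) column
set.seed(7)
wide <- data.frame(
  Mill = rep(c("Kx", "Mu"), each = 5),
  Na = runif(10), K = runif(10),
  `Na Fast` = runif(10), `K Fast` = runif(10),
  check.names = FALSE
)

elements <- c("Na", "K")

# Stack one block of rows per element, pairing each leached value with its solid value
long <- do.call(rbind, lapply(elements, function(el) {
  data.frame(
    Element = el,
    Mill    = wide$Mill,
    leached = as.numeric(wide[[el]]),
    solid   = wide[[paste(el, "Fast")]]
  )
}))

# One panel per element, each with its own x and y scales
p_facets <- ggplot(long, aes(x = solid, y = leached, colour = Mill)) +
  geom_point() +
  facet_wrap(~ Element, scales = "free") +
  xlab("mg/kg in solid GLD") +
  ylab("mg leached/kg GLD")
```

Because the result is a single ggplot object, it has one shared legend and can be saved with a single `ggsave()` call.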

Created on 2019-05-23 by the reprex package (v0.2.1)

Dan
  • I can not get it to work. Either I get an error of "Error in parse(text = x) : :1:4: unexpected symbol" or if I use aes instead of aes_string and remove names() I get: "Error in is.finite(x) : default method not implemented for type 'list'". There just does not seem to be a way for me of saving plots generated by loops. – Nanna May 24 '19 at 17:24
  • @Nanna If you copy and paste the code above into a fresh session, does it work as shown? – Dan May 24 '19 at 17:26
  • @Lyngbakr, if I copy the code, dummy df and all, it runs without errors or warnings, but no plots appear in the plot window (I use RStudio). The df is created, but df_plot still seems to contain my data (that is, the selected columns from my original data frame SardF.plot, since I first tried to use SardF.plot instead of the dummy df). – Nanna May 24 '19 at 17:32
  • I have restarted RStudio, created a new script and tried the code exactly as is. df is created and the myplot function seems to be accepted, but when running the lapply code df_plot is not created. No errors or warnings. I also created all the plots manually and named them pNa etc. Individually, right after they are made, all the plots are correct, but if I run plot_grid or grid.arrange the plots are all redrawn as the last plot created (pPb, not the last plot fed to plot_grid). And if I use print(pNa), the plot shown is the last plot created, pPb, but with the title for pNa (the dots are wrong). – Nanna May 24 '19 at 18:25
  • 2
    Just a suggestion for your data creation, `df = as.data.frame(replicate(16, runif(10)))`. If you want the names too `names(df) = LETTERS[1:16]`. – Gregor Thomas May 24 '19 at 18:29
  • 1
    `df_plot` is a temporary data frame that is created when `myplot` is called, but does not persist afterwards. It *shouldn't* exist after the code is run. What *should* exist is `plist` – that is, the list of plots. – Dan May 24 '19 at 18:31
  • @Gregor Thanks for the tip! That's way more concise than my approach. – Dan May 24 '19 at 18:33
  • I think that the problem lies with my RStudio and the versions of the libraries, not the code itself; otherwise your example should have worked. I will keep working on it, but for now I will leave it unsolved in case someone else has had the same problem and found a solution. Thanks for all the help! :-) – Nanna May 29 '19 at 08:13