
I'm using the grid and gridExtra packages to put multiple ggplots on one grid. I need to make 28 ggplots in total, each with a different y variable. Writing a loop that builds a plot is simple enough, but I need each plot saved as its own object. Currently I have 28 copies of the same plot code with only the y variable changed, and I know there must be a better way.

My minimum code example:

dattn <-  ggplot()+
  geom_boxplot(
  data = dat, 
  mapping = aes(x = site, y = tn)
)

My current loop heading (not sure if it makes any difference):

for (i in dat[, 10:12])

I need to switch the y value to tp and tss and save those ggplots as dattp and dattss. The columns of the variables are 10, 11, and 12 in that order.

While researching this problem, I came across the question "Dynamically naming variables in a loop to create 30 ggplots", which is very similar to my problem. However, it uses pipe functions, which I have no idea how to use, whereas I have some idea of how to use a for loop. If anyone thinks that approach would work better, I'd be grateful to be told so.

I have also tried the code paste("dat", i, sep="") <- ggplot..., taken from the question "Save ggplot objects in loop", but that gives me the error message "target of assignment expands to non-language object", and that question again uses pipe functions.

I'll update my question with any more necessary info as it's needed. Thanks in advance.


I asked for help with for loops because I also need to subset my data by location ("farm" in the data). I figured that once I could change the variable inside a loop, I would also be able to change how I subset the data and use nested loops.


Data using dput(head(dat)):

structure(list(year = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("2018", 
"2019"), class = "factor"), month = structure(c(4L, 5L, 5L, 6L, 
7L, 7L), .Label = c("1", "2", "3", "4", "5", "6", "7", "9", "10", 
"11", "12"), class = "factor"), day = c(24L, 18L, 30L, 25L, 6L, 
19L), farm = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("ARR", 
"Car", "CAR", "Mur", "Muz", "PBR", "Pre", "PRE", "Sch", "SCH", 
"Sim", "SIM", "STU"), class = "factor"), treat = structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("CC", "Con"), class = "factor"), 
    tss = c(1955, 3540, 4893.3, 410, 3357.5, 1836), tn = c(17, 
    32.8, 7.26, 5.91, 16.1, 16.7), tp = c(4.35, 10, 49.5, 3.57, 
    9.79, 11.1), dis = c(8178, 184232, 401364, 1113947, 10728, 
    21869), tss.1 = c(1.576347171, 64.30227415, 193.6414217, 
    45.03046056, 3.551344392, 3.958763937), tn.1 = c(0.013707367, 
    0.595795082, 0.28729829, 0.649097614, 0.017029529, 0.036008365
    ), tp.1 = c(0.003507473, 0.181644842, 1.958851976, 0.392094498, 
    0.010355223, 0.023933704), site = structure(c(1L, 1L, 1L, 
    1L, 1L, 1L), .Label = c("Car CC", "Mur CC", "Muz CC", "Pre CC", 
    "Sch CC", "Sim CC", "CAR CC", "ARR CC", "PBR CC", "PRE CC", 
    "STU CC", "Car Con", "Mur Con", "Muz Con", "Pre Con", "Sch Con", 
    "Sim Con", "SCH Con", "PRE Con", "ARR Con", "PBR Con", "SIM Con", 
    "STU Con"), class = "factor")), row.names = c(NA, 6L), class = "data.frame")

And some formatting code:

# Set variables as factors
cols <- c("year", "month")
dat[cols] <- lapply(dat[cols], as.factor)

# Set dat$site to combine farm and treat
dat$site <- paste(dat$farm, dat$treat)

# Keep the sites in order of appearance instead of alphabetical order.
# unique() is needed, otherwise factor() fails with a duplicated-levels error
dat$site <- factor(dat$site, levels = unique(dat$site))

I'm not sure if that helps but it was what was suggested to me.

3 Answers


One option is map from purrr. Loop over the column names as strings and pass each one to aes_string:

library(tidyverse)
v1 <- c('tn', 'tp', 'tss')
out <- map(v1, ~ 
            ggplot()+
     geom_boxplot(
      data = dat, 
      mapping = aes_string(x = "site", y = .x) # or, with tidy evaluation:
      # mapping = aes(x = site, y = !!rlang::sym(.x))
      ))
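
If a for loop feels more familiar than map, the same string-based idea carries over. The sketch below is not part of the original answer: it uses a small made-up stand-in for dat (same column names, invented layout) and stores each plot in a named list. Using the pasted name as a list index avoids the "target of assignment expands to non-language object" error, because nothing is assigned to the result of paste() itself. The `!!rlang::sym(v)` form is the tidy-evaluation alternative from the comment in the code above.

```r
library(ggplot2)

# Assumption: a small stand-in for the OP's `dat`, with the same column names.
dat <- data.frame(
  site = rep(c("Car CC", "Mur CC"), each = 3),
  tn   = c(17, 32.8, 7.26, 5.91, 16.1, 16.7),
  tp   = c(4.35, 10, 49.5, 3.57, 9.79, 11.1),
  tss  = c(1955, 3540, 4893.3, 410, 3357.5, 1836)
)

yvars <- c("tn", "tp", "tss")
plots <- list()

for (v in yvars) {
  # `!!` inlines the column name now; without it, aes() would look up `v`
  # only when the plot is drawn, and every plot would show the last column.
  plots[[paste0("dat", v)]] <- ggplot(dat, aes(x = site, y = !!rlang::sym(v))) +
    geom_boxplot() +
    labs(y = v)
}

names(plots)
```

Individual plots are then available as `plots$dattn` or `plots[["dattp"]]`, and the whole list can be handed to `gridExtra::grid.arrange(grobs = plots)`. If separate objects are really required, `assign(paste0("dat", v), ...)` inside the loop is the base R way to create them, though a named list is usually easier to work with.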
akrun
  • Isn't `aes_string` on soft deprecation [per docs](https://ggplot2.tidyverse.org/reference/aes_.html) and [blog](https://cmdlinetips.com/2018/07/ggplot2-version-3-0-0-brings-tidy-evaluation-to-ggplot/)? Someone informed me of this when I suggested it. You actually [answered](https://stackoverflow.com/a/57212156/1422451) same question using `rlang` bang bangs! – Parfait Aug 05 '19 at 14:56
  • It looks like it would be deprecated – akrun Aug 05 '19 at 15:03
  • This is a good breakthrough. The code now produces the 3 ggplots and stores them as a list of 3, but how do I retrieve the plots? – Orisa is your shield Aug 05 '19 at 15:03
  • @Orisaisyourshield. Using the standard list extraction methods. If you have named list, it can be extracted with `out$name1` or for general case `out[[1]]` for first list element – akrun Aug 05 '19 at 15:04
  • @akrun I really appreciate you taking your time helping me with this, I try `out[[1]]` but receive `Error in FUN(X[[i]], ...) : object 'x' not found` – Orisa is your shield Aug 05 '19 at 15:09
  • @Orisaisyourshield Are you saying that the `out <- map(...` worked without any error, and the error only appeared when subsetting? – akrun Aug 05 '19 at 15:10
  • @Orisaisyourshield. I had a typo; it should be `mapping = aes_string(x = "site", y =.x)`. The `site` needs to be quoted – akrun Aug 05 '19 at 15:11
  • @akrun, Ahh, I received errors when using the original code and assumed it was on my end so I used your alternate code, but with your edit it works. – Orisa is your shield Aug 05 '19 at 15:33

Consider reshaping your wide data to long format and then plotting with facet_wrap or facet_grid, which avoids loops, mapping, and saving many plots for gridExtra::grid.arrange. The example below uses random, seeded data.

Data (assuming below structure mirrors OP's actual data)

library(ggplot2)    # ggplot, geom_boxplot, facet_wrap
library(gridExtra)  # grid.arrange

set.seed(852019)

### DATA BUILD
random_df <- data.frame(
  site = sample(c("sas", "stata", "spss", "python", "r", "julia"), 500, replace=TRUE),
  var2 = NA,
  var3 = NA,
  var4 = NA,
  var5 = NA,
  var6 = NA,
  var7 = NA,
  var8 = NA,
  var9 = NA,
  tn = rnorm(500),
  tss = rnorm(500),
  tp = rnorm(500)
)

head(random_df)

#    site var2 var3 var4 var5 var6 var7 var8 var9         tn         tss          tp
# 1 stata   NA   NA   NA   NA   NA   NA   NA   NA  2.0237416 -1.30919981 -1.71918905
# 2     r   NA   NA   NA   NA   NA   NA   NA   NA  0.6052126 -0.27231149  0.18739618
# 3     r   NA   NA   NA   NA   NA   NA   NA   NA  1.3270657 -0.70308896  0.04996251
# 4   sas   NA   NA   NA   NA   NA   NA   NA   NA -0.8690220  0.09934931 -0.12513907
# 5 julia   NA   NA   NA   NA   NA   NA   NA   NA -1.8871174  0.08761820 -0.45409606
# 6   sas   NA   NA   NA   NA   NA   NA   NA   NA  0.3205017 -0.61696052  0.32586570

Plot

# RESHAPE WIDE TO LONG
yvars <- c("tn", "tss", "tp")

long_df <- reshape(random_df, varying = yvars, v.names = "y_value",
                   times = yvars, timevar = "y_var",
                   idvar = c("site"), drop = c(2:9),
                   new.row.names = 1:1E4, direction = "long")

head(long_df)
#    site y_var    y_value
# 1 stata    tn  2.0237416
# 2     r    tn  0.6052126
# 3     r    tn  1.3270657
# 4   sas    tn -0.8690220
# 5 julia    tn -1.8871174
# 6   sas    tn  0.3205017

# BOXPLOT WITH FACET
ggplot() + geom_boxplot(data = long_df, mapping = aes(x = site, y = y_value)) + 
  facet_wrap(~y_var)
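
When the variables sit on very different scales, facet_wrap can give each panel its own y axis through its scales argument. The following is a minimal sketch rather than part of the original answer; demo_long is made-up data with the same columns as long_df, scaled so that tss is in the 1000s and tp is mostly below 5.

```r
library(ggplot2)

set.seed(1)
# Assumption: invented long-format data mirroring the columns of long_df.
demo_long <- data.frame(
  site    = rep(c("a", "b"), times = 30),
  y_var   = rep(c("tn", "tp", "tss"), each = 20),
  y_value = c(rnorm(20, 15, 5), rnorm(20, 3, 1), rnorm(20, 2500, 800))
)

p_free <- ggplot(demo_long, aes(x = site, y = y_value)) +
  geom_boxplot() +
  facet_wrap(~ y_var, scales = "free_y")  # each panel gets its own y axis

p_free
```

With the default `scales = "fixed"`, all panels share one y axis, so the tss range would flatten the tn and tp boxes.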

Box Plot Output


Should you want separate graphs, still consider the long format and use by on the y_var indicator column to build a list of plots. Then arrange them with gridExtra::grid.arrange:

plot_list <- by(long_df, long_df$y_var, function(sub) {
  ggplot() + geom_boxplot(data = sub, mapping = aes(x = site, y = y_value)) +
    ggtitle(sub$y_var[[1]])
}) 

do.call(grid.arrange, args=list(grobs=plot_list, nrow = 1))



Using OP's Sample Data

# RESHAPE WIDE TO LONG
yvars <- c("tn.1", "tss.1", "tp.1")

long_df <- reshape(dat, varying=yvars, v.names="y_value",
                   times = yvars, timevar = "y_var",
                   idvar = c("site"), drop=c(2:9),
                   new.row.names = 1:1E4, direction = "long")
long_df$y_var <- sub("\\.1$", "", long_df$y_var)  # strip the literal ".1" suffix

# GRID ARRANGE PLOT
plot_list <- by(long_df, long_df$y_var, function(sub) {
  ggplot() + geom_boxplot(data = sub, mapping = aes(x = site, y = y_value)) +
    ggtitle(sub$y_var[[1]])
}) 

do.call(grid.arrange, args=list(grobs=plot_list, nrow = 1))

OP Data Plot Output

Parfait
  • I had considered using `facet_wrap` but my data values are wildly different, with tss being in the 1000s and tp mostly below 5. I know it's possible to set different scales but it is easier just to create separate plots and combine them with `grid`. I'll also edit my question with my data and additional explanation soon. – Orisa is your shield Aug 05 '19 at 15:49
  • Got it. But still consider the long format and run `by` on the unique indicators *(tn, tss, tp)* from long format. Then use `gridExtra::grid.arrange`. See extended answer. – Parfait Aug 05 '19 at 17:26

Or you can use the .data pronoun, .data[[ ]], to refer to the desired columns by name:

library(tidyverse)

# define plotting function
plot_gg <- function(dat, x_var, y_var) {

  dattn <-  ggplot() +
    geom_boxplot(
      data = dat, 
      mapping = aes(x = .data[[x_var]], y = .data[[y_var]])
    ) +
    labs(x = x_var, y = y_var)
  return(dattn)
}

# save plots in a list
plot_list <- c('tn', 'tp', 'tss') %>% 
  map(~ plot_gg(dat, 'site', .x))
plot_list
#> [[1]]

#> 
#> [[2]]

#> 
#> [[3]]

Created on 2019-08-05 by the reprex package (v0.3.0)
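
The question also mentions subsetting by farm with nested loops. A plotting function like the one above makes that straightforward, since the subset can simply be passed in as the data argument. The sketch below is an assumption-laden illustration, not part of the original answer: dat is a small invented stand-in, farm_plots is a hypothetical name, and plot_gg is restated in a shorter form so the block runs on its own.

```r
library(ggplot2)

# Assumption: invented stand-in for the OP's `dat`, two farms with two sites each.
dat <- data.frame(
  farm = rep(c("Car", "Mur"), each = 4),
  site = rep(c("Car CC", "Car Con", "Mur CC", "Mur Con"), each = 2),
  tn   = c(17, 32.8, 7.26, 5.91, 16.1, 16.7, 12.4, 9.8),
  tp   = c(4.35, 10, 49.5, 3.57, 9.79, 11.1, 2.2, 6.1),
  tss  = c(1955, 3540, 4893.3, 410, 3357.5, 1836, 2210, 980)
)

# Same idea as plot_gg above: column names arrive as strings.
plot_gg <- function(dat, x_var, y_var) {
  ggplot(dat, aes(x = .data[[x_var]], y = .data[[y_var]])) +
    geom_boxplot() +
    labs(x = x_var, y = y_var)
}

yvars <- c("tn", "tp", "tss")
farm_plots <- list()

for (f in as.character(unique(dat$farm))) {
  farm_dat <- dat[dat$farm == f, ]        # outer loop: one subset per farm
  for (v in yvars) {                      # inner loop: one plot per y variable
    farm_plots[[paste(f, v, sep = "_")]] <- plot_gg(farm_dat, "site", v)
  }
}

names(farm_plots)
```

Because x_var and y_var are function arguments, each plot keeps its own column names, so calling the function inside a for loop does not run into the lazy-evaluation problem that a bare `aes()` with the loop variable would.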

Tung