0

I love R, but unfortunately I still have a lot to learn, which I definitely want to do.

The data:

Classes 'grouped_df', 'tbl_df', 'tbl' and 'data.frame': 3550 obs. of  18   variables:
$ SAMPLE.ID  : Factor w/ 150 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ COMMUNITY  : chr  "com.1" "com.1" "com.1" "com.1" ...
$ NUTRIENT   : Factor w/ 25 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
$ RATIO      : Factor w/ 23 levels "3.2","4","5.4",..: 11 9 6 4 1 14 10 8 5 2 ...
$ PHOS       : Factor w/ 5 levels "0.09","0.195",..: 5 5 5 5 5 4 4 4 4 4 ...
$ NIT        : Factor w/ 5 levels "1.5482","3.0964",..: 5 4 3 2 1 5 4 3 2 1 ...
$ DATUM      : Factor w/ 35 levels "30.08.16","31.08.16",..: 1 1 1 1 1 1 1 1 1 1 ...
$ DAY        : int  0 0 0 0 0 0 0 0 0 0 ...
$ TYPE       : chr  "mono" "mono" "mono" "mono" ...
$ ALGAE      : Factor w/ 6 levels "ANK","CHLA","MIX A",..: 5 5 5 5 5 5 5 5 5 5 ...
$ MEAN       : num  864 868 882 873 872 ...
$ GROW       : num  0.00116 0.00115 0.00113 0.00115 0.00115 ...
$ FLUORO     : num  NA NA NA NA NA NA NA NA NA NA ...
$ MEAN.MQ    : num  0.964 0.969 0.985 0.975 0.973 ...
$ GROW.MQ    : num  1.04 1.03 1.02 1.03 1.03 ...
$ carbon     : num  -764 -913 -1394 -1085 -1039 ...
$ carbon.unit: chr  "mikro g per litre" "mikro g per litre" "mikro g per litre" "mikro g per litre" ...
$ growthrate : num  NA NA NA NA NA NA NA NA NA NA ...

What I am looking for:

I have already created a ton of other plots, but every time I write the same code for each plot, which I then have to edit manually, and I really need to work on the effort/result balance.

I would like to generate a ggplot with

DAY for the X axis and growthrate for the Y axis

I need to generate such a plot for all combinations of ALGAE and NUTRIENT.

Since that is a lot of plots, it would be helpful if the title adjusted automatically. I would also like to store them in a list, like this:

plot_list <- list() 

I know the ggplot code should contain aes_string instead of aes, but I have looked through so many questions now and I cannot, for the life of me, figure this out.

Help would lead to serious relief, gratitude and even a skipped heartbeat.

3 Answers

1

Please try to include reproducible data in the future. See here for more examples. For this answer, I will use the diamonds dataset that comes with ggplot2.

The simplest approach is to use facet_grid, which is intended for this very purpose:

ggplot(diamonds
       , aes (x = carat
              , y = price)) +
  geom_smooth() +
  facet_grid(clarity~color)

This gives a panel for each clarity/color pair (like your factors of interest).


If you must have each of these as a separate plot for some reason, a nested lapply like this should work. It loops over the levels of each of the two factors and filters the data down to just the rows that match each pair. (The do.call is there to put them all back into a single list, instead of a list of lists.) Note that this uses dplyr for the filtering steps (and to load the pipe).

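library(dplyr)    # filter() and the %>% pipe
library(ggplot2)

allPairs <-
  lapply(levels(diamonds$clarity), function(thisClarity){
    lapply(levels(diamonds$color), function(thisColor){
      # Keep only the rows for this clarity/color pair, then plot them
      diamonds %>%
        filter(clarity == thisClarity
               , color == thisColor) %>%
        ggplot(aes(x = carat
                   , y = price)) +
        geom_smooth() +
        ggtitle(paste0("Clarity: ", thisClarity
                       , "\nColor: ", thisColor))
    })
  }) %>%
  # Flatten the list of lists into a single list of plots
  do.call(c, .)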

Then, you can handle that list of plots however you are currently handling them.
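To connect this back to the columns in the question, here is a minimal sketch of the same idea using split() and Map() instead of nested lapply calls. The data frame toy is made up purely for illustration (only the column names ALGAE, NUTRIENT, DAY and growthrate come from the question), and geom_point() is an assumption about the kind of plot wanted.

```r
library(ggplot2)

# Toy stand-in for the asker's data: the column names come from the
# question, the values are invented for illustration only.
toy <- expand.grid(
  ALGAE    = factor(c("ANK", "CHLA")),
  NUTRIENT = factor(1:3),
  DAY      = 0:4
)
toy$growthrate <- seq_len(nrow(toy)) / 100

# One data frame per ALGAE/NUTRIENT combination, named like "ANK_1".
pieces <- split(toy, list(toy$ALGAE, toy$NUTRIENT), sep = "_", drop = TRUE)

# Build one plot per piece; the list name is reused as the plot title.
plot_list <- Map(function(d, nm) {
  ggplot(d, aes(x = DAY, y = growthrate)) +
    geom_point() +
    ggtitle(nm)
}, pieces, names(pieces))
```

Because the result is a named list, an individual plot can be pulled out with, for example, plot_list[["ANK_1"]].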

Mark Peterson
0

This should do the trick:

plot_list <- list()

for (i in levels(data$ALGAE)) {
  for (j in levels(data$NUTRIENT)) {
    # Subset to the current ALGAE/NUTRIENT combination
    dat <- data[data$ALGAE == i & data$NUTRIENT == j, ]
    plt <- ggplot(data = dat, aes(x = DAY, y = GROW)) +
      geom_point()
    # Store the plot under a name such as "ANK_1"
    plot_list[[paste(i, j, sep = "_")]] <- plt
  }
}

Good luck!

PaulH
  • Oh I see Mark Peterson already provided a more elegant answer:) – PaulH Dec 05 '16 at 20:00
  • Maybe his is more elegant, but yours is actually the one I was interested in :) Not that his isn't awesome as well, but I really wanted to see it as a loop, so thank you! – i.love.broccoli Dec 05 '16 at 20:07
  • I do like the for loop approach here, but I will point out, @i.love.broccoli: it is generally best practice to lean on the *`apply` functions when possible rather than using `for` loops. There can be some performance gains (notably when moving to parallel computing), but the primary benefits are clarity (it is generally easier to see what is happening at each step) and the fact that the `apply` family of functions does not have side effects (i.e., only things explicitly returned enter the global environment; e.g., you won't have `j` or `dat` or `plt` in the environment, and can't accidentally change data). – Mark Peterson Dec 05 '16 at 20:21
  • I totally agree Mark! – PaulH Dec 05 '16 at 20:25
0

Personally, I like to use plyr/dplyr for this; it's more compact than for loops and more readable than lapply:

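library(ggplot2)
library(plyr)       # provides .() for naming the grouping variables
library(gridExtra)  # provides grid.arrange()
p <- ggplot(diamonds, aes(x = carat, y = price)) +  geom_smooth()
allPairs2 <- plyr::dlply(diamonds, .(clarity, color), "%+%", e1=p)
grid.arrange(grobs = allPairs2)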


I use %+% as a shortcut (it overwrites the data for the "template" plot p), but an anonymous function that explicitly defines the plot could be more readable:

allPairs2 <- plyr::dlply(diamonds, .(clarity, color), 
                function(d) {ggplot(d, aes(x = carat, y = price)) +  geom_smooth()})
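For readers who prefer to stay within dplyr, the following is a rough sketch of the same split-then-plot idea. It assumes a dplyr version that provides group_split(); the name allPairs3 and the added title are illustrative choices, not part of the code above.

```r
library(ggplot2)
library(dplyr)

# Split diamonds into one data frame per clarity/color pair.
pieces <- diamonds %>%
  group_by(clarity, color) %>%
  group_split()

# Build one plot per piece, titled with the pair it belongs to.
allPairs3 <- lapply(pieces, function(d) {
  ggplot(d, aes(x = carat, y = price)) +
    geom_smooth() +
    ggtitle(paste0("Clarity: ", d$clarity[1], "\nColor: ", d$color[1]))
})
```

The result is an unnamed list with one plot per clarity/color pair present in the data.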
baptiste