I have been working on code that iterates over a data frame, creates one scatter plot per site with a second-order regression line, and exports all of them to a single PDF file, one plot per page. What I would like to do is compute the regression line equation for each plot and place it in the top-left margin of that plot.

library(gridExtra)
library(purrr)
library(tidyverse)

plot_5 <-
    Infil_Data2 %>% 
    split(.$Site_ID) %>% 
    map2(names(.),
         ~ggplot(.x, aes(Sqrt_Time.x, Cal_Vol_cm)) + 
         geom_point() +
         labs(title = paste(.y)) +
         theme(plot.title = element_text(hjust = 0.5)) + 
         stat_smooth(mapping = aes(x = Sqrt_Time.x, y = Cal_Vol_cm),
                     method = "lm", se = FALSE, 
                     formula = y ~ poly(x, 2, raw = TRUE),
                     color = "red") +
         theme(plot.margin = unit(c(1, 5, 1, 1), "cm")))


    pdf("allplots5.pdf", onefile = TRUE)
    walk(plot_5, print)
    dev.off()

Here is a sample of the Infil_Data2 dataframe that I am using:

Infil_Data2 <-
    structure(list(Time = c(0L, 30L, 60L, 90L, 120L, 150L, 180L, 
    210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
    210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
    210L, 240L, 270L, 300L), Site_ID = c("H1", "H1", "H1", "H1", 
    "H1", "H1", "H1", "H1", "H1", "H1", "H1", "H2", "H2", "H2", "H2", 
    "H2", "H2", "H2", "H2", "H2", "H2", "H2", "H3", "H3", "H3", "H3", 
    "H3", "H3", "H3", "H3", "H3", "H3", "H3"), Vol_mL = c(63, 62, 
    60, 59, 58, 56, 54, 52.5, 50, 48.5, 46.5, 82, 77, 73, 68, 65, 
    51, 56, 52, 47.5, 42.5, 37.5, 69, 67, 65, 63, 61, 60, 58, 56, 
    54, 51.5, 49), Sqrt_Time.x = c(0, 5.477225575, 7.745966692, 9.486832981, 
    10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
    16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
    10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
    16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
    10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
    16.43167673, 17.32050808), Cal_Vol_cm = c(0, 0.124339799, 0.373019398, 
    0.497359197, 0.621698996, 0.870378595, 1.119058194, 1.305567893, 
    1.616417391, 1.80292709, 2.051606688, 0, 0.621698996, 1.119058194, 
    1.74075719, 2.113776588, 3.854533778, 3.232834782, 3.730193979, 
    4.289723076, 4.911422072, 5.533121068, 0, 0.248679599, 0.497359197, 
    0.746038796, 0.994718394, 1.119058194, 1.367737792, 1.616417391, 
    1.865096989, 2.175946488, 2.486795986)), row.names = c(NA, 33L
    ), class = "data.frame")
steveb
Binx
  • For questions like this, you should provide a small reproducible example; `Infil_Data` is undefined and you haven't provided the libraries you are using (e.g. like the one used for the `walk` command you are calling). Also, `mytable` is defined but not used, is this supposed to be the same as `Infil_Data`? – steveb Jan 23 '19 at 05:37
  • Thanks steveb, is there anything else I need to add or clear up? – Binx Jan 23 '19 at 05:48
  • If a function is used in your code sample, you should include lines like `library(ggplot2)`, `library(purrr)`, etc. If `mytable` isn't used, you should remove it as it is not relevant to the question. If you have another question, then that is where it should go. – steveb Jan 23 '19 at 05:52
  • Okay, I will remove it. I just put it there because it was part of my overall code and was connected to the picture that I provided. Thanks for the comments on how to make my questions better. – Binx Jan 23 '19 at 05:59
  • Can you run `dput(Infil_Data)` and put the results in place of what you have `tibble::tribble(......)`. So you would set `Infil_Data` to the results of `dput`. – steveb Jan 23 '19 at 14:07
  • Please see [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Your question should include code that works making this reproducible. – steveb Jan 23 '19 at 14:15
  • Your code doesn't work as is. – steveb Jan 23 '19 at 14:26
  • Sorry, I have been away for a few days. I reformatted my question and made the changes you suggested. I ran the code in my Rstudio and it works for me. Does it work for you? – Binx Jan 30 '19 at 04:08
  • No worries. Question, your code produces a list of plots, however, only the first element of the list appears to have a plot that produces non empty results. To simplify this question, and make it more reproducible, it would be a good idea to have the minimum number of plots required to answer the questions (2 or 3?). Also, ensuring that all plots actually plot data. – steveb Jan 31 '19 at 18:27
  • Okay, it should work now, producing 3 plots (H1, H2, and H3). Let me know if it does not. – Binx Feb 01 '19 at 17:12
  • I provided an answer which should address what you are looking for. I edited your code to achieve two outcomes: (1) one can cut/paste the code as is and it should work (including the data), (2) fixed the indentation for the call to the plotting code. – steveb Feb 01 '19 at 18:35

1 Answer

Based on what you did and the post "Adding Regression Line Equation and R2 on graph", the code below produces a PDF with one plot per page and the equation on each plot. The equations appear in the same relative position on every plot even though the scales change. The original code in the question was very close; the code below only adds the call to stat_smooth_func.

# Input data.
Infil_Data2 <-
structure(list(Time = c(0L, 30L, 60L, 90L, 120L, 150L, 180L, 
210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
210L, 240L, 270L, 300L), Site_ID = c("H1", "H1", "H1", "H1", 
"H1", "H1", "H1", "H1", "H1", "H1", "H1", "H2", "H2", "H2", "H2", 
"H2", "H2", "H2", "H2", "H2", "H2", "H2", "H3", "H3", "H3", "H3", 
"H3", "H3", "H3", "H3", "H3", "H3", "H3"), Vol_mL = c(63, 62, 
60, 59, 58, 56, 54, 52.5, 50, 48.5, 46.5, 82, 77, 73, 68, 65, 
51, 56, 52, 47.5, 42.5, 37.5, 69, 67, 65, 63, 61, 60, 58, 56, 
54, 51.5, 49), Sqrt_Time.x = c(0, 5.477225575, 7.745966692, 9.486832981, 
10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
16.43167673, 17.32050808), Cal_Vol_cm = c(0, 0.124339799, 0.373019398, 
0.497359197, 0.621698996, 0.870378595, 1.119058194, 1.305567893, 
1.616417391, 1.80292709, 2.051606688, 0, 0.621698996, 1.119058194, 
1.74075719, 2.113776588, 3.854533778, 3.232834782, 3.730193979, 
4.289723076, 4.911422072, 5.533121068, 0, 0.248679599, 0.497359197, 
0.746038796, 0.994718394, 1.119058194, 1.367737792, 1.616417391, 
1.865096989, 2.175946488, 2.486795986)), row.names = c(NA, 33L
), class = "data.frame")

Plotting code

# For the "stat_smooth_func", use the Laurae package.
# devtools::install_github("Laurae2/Laurae")

library(gridExtra)
library(purrr)
library(tidyverse)
library(Laurae)

plot_5 <-
    Infil_Data2 %>% 
    split(.$Site_ID) %>% 
    map2(names(.),
         ~ggplot(.x, aes(Sqrt_Time.x, Cal_Vol_cm)) + 
         geom_point() +
         labs(title = paste(.y)) +
         theme(plot.title = element_text(hjust = 0.5)) + 
         stat_smooth(mapping = aes(x = Sqrt_Time.x, y = Cal_Vol_cm),
                     method = "lm", se = FALSE, 
                     formula = y ~ poly(x, 2, raw = TRUE),
                     color = "red") +
         theme(plot.margin = unit(c(1, 5, 1, 1), "cm")) +
         stat_smooth_func(geom="text", method = "lm", hjust=0, parse=TRUE))


pdf("allplots5.pdf", onefile = TRUE)
walk(plot_5, print)
dev.off()
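
If installing an extra package is not an option, the equation can also be built directly from an `lm` fit and shown as the plot subtitle, which ggplot2 draws left-aligned above the panel by default. The following is only a sketch for the H1 site: `poly2_equation` is a hypothetical helper (not part of any package), and the H1 values are copied from the data above with Sqrt_Time.x recomputed as sqrt(Time).

```r
library(ggplot2)

# H1 site only, rebuilt compactly from the question's data
h1 <- data.frame(
  Sqrt_Time.x = sqrt(seq(0, 300, by = 30)),
  Cal_Vol_cm  = c(0, 0.124339799, 0.373019398, 0.497359197, 0.621698996,
                  0.870378595, 1.119058194, 1.305567893, 1.616417391,
                  1.80292709, 2.051606688)
)

# Hypothetical helper: fit the same 2nd-order model that stat_smooth() uses
# and format the coefficients as "y = a + b x + c x^2"
poly2_equation <- function(df, digits = 3) {
  fit <- lm(Cal_Vol_cm ~ poly(Sqrt_Time.x, 2, raw = TRUE), data = df)
  b <- signif(unname(coef(fit)), digits)
  sprintf("y = %s %s %s x %s %s x^2",
          format(b[1]),
          ifelse(b[2] < 0, "-", "+"), format(abs(b[2])),
          ifelse(b[3] < 0, "-", "+"), format(abs(b[3])))
}

eq <- poly2_equation(h1)

# The subtitle sits in the top-left margin, above the plotting panel
p <- ggplot(h1, aes(Sqrt_Time.x, Cal_Vol_cm)) +
  geom_point() +
  stat_smooth(method = "lm", se = FALSE,
              formula = y ~ poly(x, 2, raw = TRUE), color = "red") +
  labs(title = "H1", subtitle = eq) +
  theme(plot.title = element_text(hjust = 0.5))
```

Calling the helper inside the `map2()` call on each `.x` would give one equation per site; the subtitle is plain text here, so the exponent is not typeset as a superscript.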
steveb
  • I added the single line, but am getting an error: "geom_text requires the following missing aesthetics: label". – Binx Feb 01 '19 at 19:17
  • A couple of things. Did you cut/paste from my post into your R session, including the setting of `Infil_Data2 ` ? Also, it might be worth restarting R (under menu **Session**, there is the **Restart R** option in RStudio). – steveb Feb 01 '19 at 19:24
  • Yes I cut/paste from your post. What do you mean the setting of Infil_Data2? I also restarted R. – Binx Feb 01 '19 at 19:34
  • See my edit with the `Laurae` package. Apparently I had `stat_smooth_func` in my environment but it wasn't included in my code. If this doesn't resolve your issue, then there is likely something else that differs from our environments. – steveb Feb 01 '19 at 19:44
  • For "setting of Infil_Data2", I mean cut / paste from the post to ensure that `Infil_Data2` matches what is in this post. Essentially I am suggesting starting with an environment that is "clean'ish". – steveb Feb 01 '19 at 19:49
  • I tried adding Laurae from Github, but am running into the error of: "Error in read.dcf(path) : Found continuation line starting ' modeling. ...' at begin of record." – Binx Feb 01 '19 at 19:59
  • What platform are you on, is it Windows ? I googled your error and found a link with a similar issue with a different package, but the workaround may be the same : [Error in read.dcd](https://github.com/psoerensen/qgg/issues/3). Also, what version of R and `devtools` are you using. – steveb Feb 01 '19 at 20:38
  • The following link may address your question [Installing packages onto R](https://stackoverflow.com/questions/16680618/installing-packages-onto-r). My prediction is that you are on a Windows machine, as all the search results I looked at are Windows related. – steveb Feb 01 '19 at 20:55
  • I am using Windows. I am using version 2.0.1 of devtools and I am on version 1.1.456 of R. I will read through the links you have provided tomorrow and get back to you with further questions if the workaround does not work. Cheers. – Binx Feb 05 '19 at 05:47
  • I also have to add the ability to put higher order polynomials into the equation annotation / text on the plot. – steveb Feb 05 '19 at 18:03
  • Sorry that should have read 3.5.1 version of R. I did just update it to 3.5.2. I am also asking the github community about my problem with Laurae. I will comment here when I eventually get it to work. – Binx Feb 07 '19 at 02:25
  • Hey steveb, I was able to get some help from a user on the R studio community because I could not get yours to work. See comment below. – Binx Feb 11 '19 at 22:50
  • `stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~")), label.x.npc = "left", label.y.npc = 0.90, formula = y ~ poly(x, 2, raw = TRUE), parse = TRUE, rr.digits = 3)` (here `label.x.npc` and `label.y.npc` set the position of the equation) – Binx Feb 11 '19 at 22:52
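
For readers who want to try the `stat_poly_eq` approach from the last comment end to end, here is a minimal sketch. It rests on a few assumptions: `stat_poly_eq` comes from the `ggpmisc` package; recent ggpmisc releases, as far as I know, take the position through `label.x` / `label.y` rather than the `label.x.npc` / `label.y.npc` names quoted in the comment; and `after_stat()` is the current ggplot2 spelling of the `..name..` notation. The data frame is a compact rebuild of the question's `Infil_Data2` (Sqrt_Time.x recomputed as sqrt(Time)), and `poly2` and `site_data` are names introduced here only for illustration.

```r
library(ggplot2)
library(ggpmisc)  # assumed source of stat_poly_eq()

# Compact rebuild of the question's data: 3 sites x 11 readings
Infil_Data2 <- data.frame(
  Site_ID     = rep(c("H1", "H2", "H3"), each = 11),
  Sqrt_Time.x = rep(sqrt(seq(0, 300, by = 30)), times = 3),
  Cal_Vol_cm  = c(0, 0.124339799, 0.373019398, 0.497359197, 0.621698996,
                  0.870378595, 1.119058194, 1.305567893, 1.616417391,
                  1.80292709, 2.051606688,
                  0, 0.621698996, 1.119058194, 1.74075719, 2.113776588,
                  3.854533778, 3.232834782, 3.730193979, 4.289723076,
                  4.911422072, 5.533121068,
                  0, 0.248679599, 0.497359197, 0.746038796, 0.994718394,
                  1.119058194, 1.367737792, 1.616417391, 1.865096989,
                  2.175946488, 2.486795986)
)

# One formula shared by the fitted line and the printed equation,
# so the two cannot drift apart
poly2 <- y ~ poly(x, 2, raw = TRUE)

site_data <- split(Infil_Data2, Infil_Data2$Site_ID)

# One plot per site; Map() keeps the site names on the resulting list
plot_5 <- Map(function(df, site) {
  ggplot(df, aes(Sqrt_Time.x, Cal_Vol_cm)) +
    geom_point() +
    stat_smooth(method = "lm", se = FALSE, formula = poly2, color = "red") +
    stat_poly_eq(aes(label = paste(after_stat(eq.label),
                                   after_stat(rr.label), sep = "~~~")),
                 formula = poly2, label.x = "left", label.y = 0.90,
                 rr.digits = 3, parse = TRUE) +
    labs(title = site) +
    theme(plot.title = element_text(hjust = 0.5))
}, site_data, names(site_data))

# One page per plot in a single PDF, as in the question
pdf("allplots5.pdf", onefile = TRUE)
invisible(lapply(plot_5, print))
dev.off()
```

Because the label position is given in normalized coordinates, the equation should land in the same relative spot on every page even though each site has its own axis range.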