
I would like to create several PDF files from R Markdown.

This is a sample of my data:

mydata <- data.frame(First = c("John", "Hui", "Jared","Jenner"), Second = c("Smith", "Chang", "Jzu","King"), Sport = c("Football","Ballet","Ballet","Football"), Age = c("12", "13", "12","13"), Submission = c("Microbes may be the friends of future colonists living off the land on the moon, Mars or elsewhere in the solar system and aiming to establish self-sufficient homes.

Space colonists, like people on Earth, will need what are known as rare earth elements, which are critical to modern technologies. These 17 elements, with daunting names like yttrium, lanthanum, neodymium and gadolinium, are sparsely distributed in the Earth’s crust. Without the rare earths, we wouldn’t have certain lasers, metallic alloys and powerful magnets that are used in cellphones and electric cars.", "But mining them on Earth today is an arduous process. It requires crushing tons of ore and then extracting smidgens of these metals using chemicals that leave behind rivers of toxic waste water.

Experiments conducted aboard the International Space Station show that a potentially cleaner, more efficient method could work on other worlds: let bacteria do the messy work of separating rare earth elements from rock.", "“The idea is the biology is essentially catalyzing a reaction that would occur very slowly without the biology,” said Charles S. Cockell, a professor of astrobiology at the University of Edinburgh.

On Earth, such biomining techniques are already used to produce 10 to 20 percent of the world’s copper and also at some gold mines; scientists have identified microbes that help leach rare earth elements out of rocks.", "Blank"))

With help from the community, I was able to arrive at a cool rmarkdown solution that would create a single html file, with all the data I want.

This is saved as Essay to Word.Rmd

```{r echo = FALSE}
# using data from above
# mydata <- mydata

# Define template (using column names from data.frame)
template <- "**First:** `r First` &emsp; **Second:**  `r Second` <br>
**Age:** `r Age`    


**Submission** <br>



`r Submission`"



# Now process the template for each row of the data.frame
src <- lapply(1:nrow(mydata), function(i) {
  knitr::knit_child(text=template, envir=mydata[i, ], quiet=TRUE)
})

```
# Print result to document
`r knitr::knit_child(text=unlist(src))`


This creates a single HTML file containing all of the submissions.


I would like to create a single HTML file (or, preferably, a PDF) for each "sport" listed in the data. So I would have all the submissions for students who do "Ballet" in one file, and a separate file with all the submissions of students who play football.

I have been looking at a few different solutions, and I found this one to be the most helpful: R Knitr PDF: Is there a posssibility to automatically save PDF reports (generated from .Rmd) through a loop?

Following suit, I created a separate R script to loop through and subset the data by sport. Unfortunately, each file it creates contains ALL the students, not just those who belong to that sport:

 for (sport in unique(mydata$Sport)){
  subgroup <- mydata[mydata$Sport == sport,]
  render("Essay to Word.Rmd",output_file = paste0('report.',sport, '.html'))    
}
  1. Any idea what might be going wrong with the code above?
  2. Is it possible to create these files directly as PDF documents instead of HTML? I know I can open each file and save it as a PDF after the fact, but I will have 40 different sports files to work with.
  3. Is it possible to add a thin line between each "submission" essay within a file?

Any help would be great, thank you!!!

NewBee

2 Answers


To create a PDF directly from your Rmd file, you could use the following function in a separate R script where your data is loaded, and then use map from the purrr package to iterate over the sports (in the Rmd file, the output must be set to pdf_document):

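library(tidyverse)
library(rmarkdown)

    get_report <- function(sport){
      # Local, filtered copy of the data. render() evaluates the Rmd in this
      # function's environment by default, so the document sees only this sport.
      mydata <- mydata %>%
        filter(Sport == sport)
      render("test.rmd", output_file = paste0('report_', sport, '.pdf'))
    }

    # One report per distinct sport
    map(unique(mydata$Sport), get_report)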

Hope that is what you are looking for?
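
A note on why filtering inside a function can work: `rmarkdown::render()` has the default `envir = parent.frame()`, so the document is evaluated in the environment of whoever called `render()`. The sketch below mimics that default in base R with a hypothetical stand-in, `count_rows()`, rather than a real render call. It also suggests why the loop in the question produced every student: the Rmd there reads `mydata` and never looks at `subgroup`.

```r
# Toy data: only the column needed for the demonstration
mydata <- data.frame(Sport = c("Football", "Ballet", "Ballet", "Football"))

# Hypothetical stand-in for render(): looks up `mydata` in the caller's
# environment, the same default (envir = parent.frame()) that render() uses
count_rows <- function(envir = parent.frame()) {
  force(envir)
  nrow(get("mydata", envir = envir))
}

get_n <- function(sport) {
  # Local copy shadows the global `mydata` for anything evaluated in this frame
  mydata <- mydata[mydata$Sport == sport, , drop = FALSE]
  count_rows()
}

n_ballet <- get_n("Ballet")  # sees the filtered, function-local data
n_global <- count_rows()     # called at top level, sees the full data
```

Comparing `n_ballet` with `n_global` shows the local copy being picked up only when the call is made from inside the function.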

LaR

This could be achieved via a parametrized report like so:

  1. Add parameters for the data and, e.g., the type of sport to your Rmd
  2. Inside the loop, pass your subgroup dataset to render via the params argument
  3. You can add horizontal lines via ***
  4. If you want PDF output, use output_format = "pdf_document". To get the document to render, I also had to switch the LaTeX engine via output_options

Rmd:

---
params:
  data: null
  sport: null
---

```{r echo = FALSE}
# using data from above
data <- params$data

# Define template (using column names from data.frame)
template <- "
***

**First:** `r First` &emsp; **Second:**  `r Second` <br>
**Age:** `r Age`    

**Submission** <br>

`r Submission`"

# Now process the template for each row of the data.frame
src <- lapply(1:nrow(data), function(i) {
  knitr::knit_child(text=template, envir=data[i, ], quiet=TRUE)
})
```
# Print result to document. Sport: `r params$sport`
`r knitr::knit_child(text=unlist(src))`

R Script:

mydata <- data.frame(First = c("John", "Hui", "Jared","Jenner"), 
           Second = c("Smith", "Chang", "Jzu","King"), 
           Sport = c("Football","Ballet","Ballet","Football"), 
           Age = c("12", "13", "12","13"), 
           Submission = c("Microbes may be the friends of future colonists living off the land on the moon, Mars or elsewhere in the solar system and aiming to establish self-sufficient homes.

Space colonists, like people on Earth, will need what are known as rare earth elements, which are critical to modern technologies. These 17 elements, with daunting names like yttrium, lanthanum, neodymium and gadolinium, are sparsely distributed in the Earth’s crust. Without the rare earths, we wouldn’t have certain lasers, metallic alloys and powerful magnets that are used in cellphones and electric cars.", "But mining them on Earth today is an arduous process. It requires crushing tons of ore and then extracting smidgens of these metals using chemicals that leave behind rivers of toxic waste water.

Experiments conducted aboard the International Space Station show that a potentially cleaner, more efficient method could work on other worlds: let bacteria do the messy work of separating rare earth elements from rock.", "“The idea is the biology is essentially catalyzing a reaction that would occur very slowly without the biology,” said Charles S. Cockell, a professor of astrobiology at the University of Edinburgh.

On Earth, such biomining techniques are already used to produce 10 to 20 percent of the world’s copper and also at some gold mines; scientists have identified microbes that help leach rare earth elements out of rocks.", "Blank"))

for (sport in unique(mydata$Sport)){
  subgroup <- mydata[mydata$Sport == sport,]
  rmarkdown::render("test.Rmd", output_format = "html_document", output_file = paste0('report.', sport, '.html'), params = list(data = subgroup, sport = sport))
  rmarkdown::render("test.Rmd", output_format = "pdf_document", output_options = list(latex_engine = "xelatex"), output_file = paste0('report.', sport, '.pdf'), params = list(data = subgroup, sport = sport))
}
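
With 40 sports, it may help to build the list of render jobs first and inspect it before knitting anything. The following is a sketch in base R, assuming the same `mydata` columns and `test.Rmd` parameters as above; the `jobs` list and its field names are my own and not part of rmarkdown.

```r
# Small stand-in for the full data (essay text shortened)
mydata <- data.frame(
  First = c("John", "Hui", "Jared", "Jenner"),
  Sport = c("Football", "Ballet", "Ballet", "Football"),
  Submission = c("Essay 1", "Essay 2", "Essay 3", "Blank")
)

# split() returns a named list with one data frame per sport
by_sport <- split(mydata, mydata$Sport)

# One entry per sport: the output file name and the params for the Rmd
jobs <- lapply(names(by_sport), function(sport) {
  list(
    output_file = paste0("report.", sport, ".pdf"),
    params = list(data = by_sport[[sport]], sport = sport)
  )
})

# Check the planned file names before rendering
planned_files <- vapply(jobs, function(job) job$output_file, character(1))
print(planned_files)

# To render (requires rmarkdown and a LaTeX installation):
# for (job in jobs) {
#   rmarkdown::render("test.Rmd", output_format = "pdf_document",
#                     output_file = job$output_file, params = job$params)
# }
```

Keeping the plan separate from the rendering step makes it easy to spot a misspelled sport or an empty subgroup before any LaTeX runs.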


stefan
  • Thank you for your answer! I've adjusted your code to produce it in a word format, and it looks great! Is it possible to add a page break between each essay instead of a line? I tried \newpage but it does not work.. https://bookdown.org/yihui/rmarkdown-cookbook/pagebreaks.html – NewBee Nov 16 '20 at 03:57
  • Hm. Have you used or tried with a double backslash, i.e. `\\newpage`? Doing so worked fine for me. – stefan Nov 16 '20 at 10:50
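
A likely reason the double backslash matters: inside an R string, `\n` is the newline escape, so `"\newpage"` is read as a line break followed by `ewpage`, while `"\\newpage"` yields the literal text `\newpage`. A minimal sketch, assuming the template from the answer above with the `***` rule swapped for a page break:

```r
# Single backslash: R parses "\n" as a newline, leaving "ewpage" behind
broken <- "\newpage"

# Double backslash: an escaped backslash, so the string holds \newpage
template <- "
\\newpage

**First:** `r First` &emsp; **Second:**  `r Second` <br>
**Age:** `r Age`

**Submission** <br>

`r Submission`"

# Inspect both strings as knitr would receive them
cat(broken, "\n")
cat(template, "\n")
```

Printing with `cat()` rather than `print()` shows the text after escape processing, which is what ends up in the knitted document.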