
I'm trying to use knitr to generate a report that performs the same set of analyses on different subsets of a data set. The project contains two Rmd files: the first is a master document that sets up the workspace and the document; the second contains only the chunks that perform the analyses and generate the associated figures.

What I would like to do is knit the master file, which would then call the second file for each data subset and include the results in a single document. Below is a simple example.

Master document:

# My report

```{r}
library(iterators)
data(mtcars)
```

```{r create-iterator}
cyl.i <- iter(unique(mtcars$cyl))
```

## Generate report for each level of cylinder variable
```{r cyl4-report, child='analysis-template.Rmd'}
```

```{r cyl6-report, child='analysis-template.Rmd'}
```

```{r cyl8-report, child='analysis-template.Rmd'}
```

analysis-template.Rmd:

```{r, results='asis'}
cur.cyl <- nextElem(cyl.i)
cat("###", cur.cyl)
```

```{r mpg-histogram}
hist(mtcars$mpg[mtcars$cyl == cur.cyl], main = paste(cur.cyl, "cylinders"))
```

```{r weight-histogram}
hist(mtcars$wt[mtcars$cyl == cur.cyl], main = paste(cur.cyl, "cylinders"))
```

The problem is knitr does not allow for non-unique chunk labels, so knitting fails when analysis-template.Rmd is called the second time. This problem could be avoided by leaving the chunks unnamed since unique labels would then be automatically generated. This isn't ideal, however, because I'd like to use the chunk labels to create informative filenames for the exported plots.


A potential solution would be using a simple function that appends the current cylinder to the chunk label:

```{r paste('cur-label', cyl, sep = "-")}
```

But it doesn't appear that knitr will evaluate an expression in the chunk label position.


I also tried using a custom chunk hook that modified the current chunk's label:

knit_hooks$set(cyl.suffix = function(before, options, envir) {
    if (before) options$label <- "new-label"
})

But changing the chunk label didn't affect the filenames of the generated plots, so knitr apparently wasn't using the new label.


Any ideas on how to change chunk labels so the same child document can be called multiple times? Or perhaps an alternative strategy to accomplish this?

aaronwolen

4 Answers


For anyone else who comes across this post, I wanted to point out that @Yihui has provided a formal solution to this question in knitr 1.0 with the introduction of the knit_expand() function. It works great and has really simplified my workflow.

For example, the following will process the template script below for every level of mtcars$cyl, each time replacing all instances of {{ncyl}} (in the template) with its current value:

# My report

```{r}
library(knitr)  # makes knit_expand() and knit() available when knitr is not already attached
data(mtcars)
cyl.levels <- unique(mtcars$cyl)
```

## Generate report for each level of cylinder variable
```{r, include=FALSE}
src <- lapply(cyl.levels, function(ncyl) knit_expand(file = "template.Rmd"))
```

`r knit(text = unlist(src))`

Template:

```{r, results='asis'}
cat("### {{ncyl}} cylinders")
```

```{r mpg-histogram-{{ncyl}}cyl}
hist(mtcars$mpg[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```

```{r weight-histogram-{{ncyl}}cyl}
hist(mtcars$wt[mtcars$cyl == {{ncyl}}], 
  main = paste({{ncyl}}, "cylinders"))
```
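
To inspect what the template looks like after expansion, without knitting anything, `knit_expand()` can also take the template through its `text` argument and the value as a named argument. This is only a minimal sketch, assuming knitr is installed; the inline `template` vector is a stand-in for one chunk of template.Rmd, not the actual file:

```r
library(knitr)

# Stand-in for one chunk of template.Rmd (inline so the example is self-contained)
template <- c(
  "```{r mpg-histogram-{{ncyl}}cyl}",
  "hist(mtcars$mpg[mtcars$cyl == {{ncyl}}])",
  "```"
)

cyl.levels <- sort(unique(mtcars$cyl))

# One expanded copy per level; passing ncyl explicitly avoids relying on
# knit_expand() looking the variable up in the calling environment
src <- vapply(
  cyl.levels,
  function(ncyl) knit_expand(text = template, ncyl = ncyl),
  character(1)
)

# Print the generated chunks so the substituted labels can be checked
cat(src, sep = "\n\n")
```

Each element of `src` is a complete chunk whose label carries the cylinder value, which is what keeps the labels unique when the pieces are knitted together.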
aaronwolen
  • I have used this approach but noticed that using `echo=FALSE` in the template results in the code not being processed. Have you noticed the same behaviour? – Samuel-Rosa Jan 03 '18 at 22:25
  • **Edit**: I have used this approach along with the **bookdown** package and noticed that using `results='asis'` and `echo=FALSE` in the template results in the code not being processed. The solution is to have each output in a separate code chunk. – Samuel-Rosa Jan 04 '18 at 11:42

If you make all chunks in your child document nameless, i.e. ```{r}, it works. This is not very elegant, of course, but two issues prevent you from changing the label of the current chunk:

  1. A file is parsed before the code blocks are executed. The parser already detects duplicate labels, before any code is executed or custom hooks are called.
  2. The chunk options (inc. the label) are processed before the hook is called (logical: it's an option that triggers a hook), so the hook cannot change the label anymore.
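
The first point can be checked with a small experiment. The sketch below assumes knitr is installed and that the `knitr.duplicate.label` option has not been set to `"allow"`; the two-chunk document and the variable `ran` are made up for illustration:

```r
library(knitr)

# Two chunks sharing a label; the first would create `ran` if it were executed
doc <- c(
  "```{r same-label}",
  "ran <- TRUE",
  "```",
  "",
  "```{r same-label}",
  "2 + 2",
  "```"
)

# Separate environment so we can tell whether any chunk code was evaluated
env <- new.env()

# Capture the error message instead of stopping
msg <- tryCatch(
  knit(text = doc, envir = env, quiet = TRUE),
  error = function(e) conditionMessage(e)
)

print(msg)
# Was the first chunk evaluated before the error was raised?
print(exists("ran", envir = env, inherits = FALSE))
```

If the duplicate is caught at parse time, as described above, `ran` should not exist in `env` even though its chunk comes first.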

The reason unnamed blocks work is that internally they get the label `unnamed-chunk-` followed by the chunk number.
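
The automatically generated labels can be made visible by having each unnamed chunk print its own label. A minimal sketch, assuming knitr is installed and the default label prefix has not been changed; the two-chunk document is made up for illustration:

```r
library(knitr)

# Two unnamed chunks, each printing the label knitr assigned to it
doc <- c(
  "```{r}",
  "knitr::opts_current$get('label')",
  "```",
  "",
  "```{r}",
  "knitr::opts_current$get('label')",
  "```"
)

# knit() returns the knitted Markdown as a string when given text
out <- knit(text = doc, quiet = TRUE)
cat(out)
```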

Blocks cannot have duplicate names as internally knitr references them by label. A fix could be to make knitr add the chunk number to all chunks with duplicate names. Or to reference them by chunk number instead of label, but that seems to me a much bigger change.

ROLO
  • 7
    Your understanding is absolutely correct, and this is a convincing case that knitr needs some changes. I'm looking at your pull request now. Thanks! – Yihui Xie Aug 23 '12 at 20:58
  • @Rolo, your explanation of knitr's inner-workings was super helpful. And I really appreciate your taking the time to write [the code implementing your solution](https://github.com/yihui/knitr/issues/368). @Yihui, do you think you'll include this change? It would address 90% of what I was trying to accomplish and would make it unnecessary for me to maintain copies of Rmd files that are identical except for the modified chunk labels. The ideal solution would allow for something like `for(i in unique(mtcars$cyl)) knit_child("analysis-template.Rmd", label.suffix = i)`, if it were plausible. – aaronwolen Aug 24 '12 at 16:00
  • 1
    Yes I think I tend to accept the pull request; just give me a few more minutes because I still have a couple of alternative solutions. It is easy to solve this problem but it is hard to decide which solution to use. – Yihui Xie Aug 24 '12 at 17:10

There is a similar question posed here. Based on an arbitrary list of input plots, I was able to programmatically create R chunks and knit the outputs for use in a flexdashboard (quite useful), using `knit_expand(text=)` together with the inline expression `r paste(knitr::knit(text = paste(out, collapse = '\n')))`.
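
A rough sketch of that approach, assuming knitr is installed. The `inputs` list, the `chunk` template and the `hist-` label prefix are hypothetical stand-ins, since the original inputs are not shown:

```r
library(knitr)

# Hypothetical inputs standing in for the "arbitrary list of input plots"
inputs <- list(mpg = mtcars$mpg, wt = mtcars$wt)

# Chunk template; {{nm}} is replaced by each list name, giving unique labels
chunk <- c(
  "```{r hist-{{nm}}}",
  "hist(inputs[['{{nm}}']], main = '{{nm}}')",
  "```"
)

# One generated chunk per list element
out <- vapply(
  names(inputs),
  function(nm) knit_expand(text = chunk, nm = nm),
  character(1)
)

# In the document itself, the generated chunks would then be knitted with the
# inline expression: `r paste(knitr::knit(text = paste(out, collapse = '\n')))`
cat(out, sep = "\n\n")
```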


The question and the answers in this post allowed me to solve an analogous problem: how to produce a report exposing several paired tables (with label, caption and subcaption) without replicating the chunk options and code for each.
My solution, using knit_expand(), follows.

The template: PrintTab.Rmd

```{r, results='asis'}
#| label: tbl-{{tab}}
#| tbl-cap: "This is {{tab}}"
#| tbl-subcap:
#|   - "data"
#|   - "structure"

tbl <- {{tab}}

kbl(tbl)

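# One row per column of tbl: its name and its storage type
# (typeof() applied to the column names would always give "character")
tibble(attribute = names(tbl),
       kind = vapply(tbl, typeof, character(1))) |> 
  kbl()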
```

main.qmd

---
format: pdf
---

```{r}
library(tidyverse)
library(knitr)
library(kableExtra)
x <- c(3, 1, 2)
Table1 <- data.frame(A = x, B = letters[x])
x <- c(11, 16, 21)
Table2 <- data.frame(C = x, D = x/2)

```

`r knit(text = knit_expand(file = "PrintTab.Rmd", tab = "Table1"))`

`r knit(text = knit_expand(file = "PrintTab.Rmd", tab = "Table2"))`
Roberto Scotti