I would like to create an automated knitr report that produces a histogram for each numeric field in my data frame. My goal is to do this without naming the fields explicitly (this dataset contains over 70 fields, and I would also like to reuse the script).

I've tried a few different approaches:

  • Saving the plot to an object, p, and then calling p after the loop
    • This only plots the final plot
  • Creating an array of plots, PLOTS <- NULL, and appending the plots within the loop with PLOTS <- append(PLOTS, p)
    • Accessing these plots outside the loop did not work at all
  • I even tried saving each plot to a .png file, but would rather not deal with the overhead of saving and then re-reading each file

I'm afraid the intricacies of the plot devices are escaping me.

Question

How can I make the following chunk output each plot within the loop to the report? Currently, the best I can achieve is output of the final plot produced by saving it to an object and calling that object outside of the loop.

R markdown chunk using knitr in RStudio:

```{r plotNumeric, echo=TRUE, fig.height=3}
suppressPackageStartupMessages(library(ggplot2))
FIELDS <- names(df)[sapply(df, class)=="numeric"]
for (field in  FIELDS){
  qplot(df[,field], main=field)  
}
```

From this point, I hope to customize the plots further.

bnjmn
  • Yes. I must admit I'm new to it... – bnjmn Aug 14 '12 at 17:09
  • I've added the knitr tag to your question, and to the title, to make it clear that's what you're using. – David Robinson Aug 14 '12 at 17:13
  • Did you just forget to wrap the `qplot` in `print`? `knitr` will do that for you if the `qplot` is outside a loop, but (at least the version I have installed) doesn't detect this inside the loop (which is consistent with the behaviour of the R command line). – cbeleites unhappy with SX Aug 14 '12 at 17:37
  • @cbeleites You should probably make an answer of that, so OP can accept it. – sebastian-c Aug 15 '12 at 01:07
  • Having a similar problem. Trying to loop analyses and a ggplot figure into an .Rmd file. But after the loop runs no figures or analyses outputs print. Any ideas why? – I Del Toro Jan 27 '16 at 13:57

5 Answers

Wrap the `qplot` call in `print()`.

knitr will do that for you if the qplot is outside a loop, but (at least the version I have installed) doesn't detect this inside the loop (which is consistent with the behaviour of the R command line).
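
Applied to the chunk in the question, this could look like the sketch below. The data frame `df` here is made-up example data, not the asker's. Two things are swapped in and are not from the original post: `is.numeric()` replaces the `class(...) == "numeric"` test so that integer columns are picked up too, and `ggplot()` replaces `qplot()`, which I believe is deprecated in recent ggplot2 releases.

```r
library(ggplot2)

# Made-up example data standing in for the asker's `df` (an assumption).
set.seed(1)
df <- data.frame(
  a = rnorm(50),
  b = runif(50),
  g = rep(c("x", "y"), 25)  # non-numeric column, should be skipped
)

# Names of all numeric columns; nothing is hard-coded.
fields <- names(df)[sapply(df, is.numeric)]

for (field in fields) {
  p <- ggplot(df, aes(x = .data[[field]])) +
    geom_histogram(bins = 30) +
    ggtitle(field)
  # Explicit print(): auto-printing does not happen inside a loop.
  print(p)
}
```

Inside a knitr chunk, each `print(p)` call should then produce its own figure in the report.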

cbeleites unhappy with SX
  • Having a similar problem. Trying to loop analyses and a ggplot figure into an .Rmd file. But after the loop runs no figures or analyses outputs print. Any ideas why? – I Del Toro Jan 27 '16 at 14:11
  • This works quite well, except that when I print the charts they are out of order. Any idea what might be going on there? – Skyler Aug 18 '19 at 02:49
  • Can anyone explain why you have to wrap the whole thing rather than just piping to print at the end? eg ggplot(......) %>% print() – jzadra Nov 22 '19 at 01:02
  • @jzadra: you can do that as well. You'll still need to wrap (ggplot() + geom_* ()) in parentheses for the piping to work correctly - otherwise you'll only get a print of the last object you add to the ggplot because of operator precedence. (As for why isn't this in the answer: Pipes weren't around yet when I wrote that answer). – cbeleites unhappy with SX Nov 22 '19 at 12:35
  • @cbeleitessupportsMonica Thanks. Man I wish Hadley would re-write ggplot3 to use pipes. – jzadra Nov 22 '19 at 21:43
  • Doesn't work if we use `dygraph()` R function instead of `qplot`. – Lazarus Thurston Mar 21 '22 at 19:10
  • The `lapply` and `tagList` approach solves this in a more general and robust way. See the solution here. https://stackoverflow.com/a/35235069/1972786 – Lazarus Thurston Mar 21 '22 at 19:27
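
A sketch of the piped variant discussed in the comments above, assuming R 4.1 or later for the native `|>` pipe (`%>%` from magrittr should behave the same way). The data and the `bins` values are made up for illustration.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(100))  # made-up example data

for (bins in c(10, 30)) {
  # The parentheses matter: pipes bind more tightly than `+`, so without
  # them only the last term (the geom layer) would be piped into print().
  # print() returns the plot invisibly, so `res` holds the last plot.
  res <- (ggplot(df, aes(x = x)) + geom_histogram(bins = bins)) |> print()
}
```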

A quick note: I googled the same question and landed on this page. As of 2018, just use print() inside the loop.

for (i in 1:n){
...
    f <- ggplot(.......)
    print(f)
}
Yang Liu
  • This is literally what *"wrap `qplot` in `print`"* means, i.e., this just restates the accepted answer. – merv Jan 11 '19 at 20:18
  • Seeing the code is helpful as it is direct and makes the abstract visually obvious. – Peter May 24 '20 at 10:36
  • This makes sense to a beginner. The accepted answer does not. Therefore this answer is better. – monkey Jun 24 '22 at 03:56

I am using child Rmd files in R Markdown; this also works in Sweave.

In the main Rmd, use the following snippet:

```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
  out = c(out, knit_child('da-numeric.Rmd'))
}
```

da-numeric.Rmd looks like:

Variable `r num_var_names[i]`
------------------------------------

Missing :  `r sum(is.na(data[[num_var_names[i]]]))`  
Minimum value : `r min(na.omit(data[[num_var_names[i]]]))`  
Percentile 1 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2]`  
Percentile 99 : `r quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]`  
Maximum value : `r max(na.omit(data[[num_var_names[i]]]))`  

```{r results='asis', comment="" }
warn_extreme_values=3
d1 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[2] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[1]
d99 = quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[101] > warn_extreme_values*quantile(na.omit(data[[num_var_names[i]]]),probs = seq(0, 1, 0.01))[100]
if(d1){cat('Warning : Suspect extreme values in left tail')}
if(d99){cat('Warning : Suspect extreme values in right tail')}
```

``` {r eval=TRUE,  fig.width=6, fig.height=2}
library(ggplot2)

v <- num_var_names[i]
hp <- ggplot(na.omit(data), aes_string(x=v)) + geom_histogram( colour="grey", fill="grey", binwidth=diff(range(na.omit(data[[v]]))/100))

hp + theme(axis.title.x = element_blank(),axis.text.x = element_text(size=10)) + theme(axis.title.y = element_blank(),axis.text.y = element_text(size=10))

```
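
The summary lines in the child document recompute the full percentile vector for every value they report. As a rough sketch of the same numbers computed once, the helper below (`summarise_numeric` is a hypothetical name, not part of the answer or of dataMineR) asks `quantile()` only for the four probabilities that are actually used:

```r
# Hypothetical helper, for illustration only.
# Returns the values the child document reports for one numeric vector.
summarise_numeric <- function(x, warn_extreme_values = 3) {
  x_ok <- x[!is.na(x)]
  # 0%, 1%, 99% and 100% quantiles, computed in a single call.
  q <- quantile(x_ok, probs = c(0, 0.01, 0.99, 1), names = FALSE)
  list(
    missing = sum(is.na(x)),
    min = q[1],
    p01 = q[2],
    p99 = q[3],
    max = q[4],
    # Same comparisons as d1 and d99 in the child document.
    warn_left = q[2] > warn_extreme_values * q[1],
    warn_right = q[4] > warn_extreme_values * q[3]
  )
}

x <- c(NA, 1:100, 1000)  # made-up data with one large value
s <- summarise_numeric(x)
str(s)
```

In the child document, `x` would be `data[[num_var_names[i]]]`.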

See my dataMineR package on GitHub: https://github.com/hugokoopmans/dataMineR

Hugo Koopmans
  • Hey :) Where does the knit_child() function come from? – Fabian Jul 06 '16 at 07:16
  • This gives more flexibility to create figure captions and add other details than the accepted answer. You can use the `fig.cap` chunk option and change the chunk names for each iteration of the loop. – Ricecakes Feb 20 '19 at 19:38
  • Would it be possible to reference a knit label instead of a separate file? – user3072843 Mar 12 '20 at 20:12

As an addition to Hugo's excellent answer, I believe that as of 2016 you also need to emit the collected output, using an inline R expression placed after the chunk:

```{r run-numeric-md, include=FALSE}
out = NULL
for (i in c(1:num_vars)) {
  out = c(out, knit_child('da-numeric.Rmd'))
}
```

`r paste(out, collapse = '\n')`
Alex

For knitting Rmd to HTML, I find it more convenient to have a list of figures. In this case I get the desired output with results='hide', as follows:

---
title: "Make a list of figures and show it"
output: 
  html_document
---


```{r}
suppressPackageStartupMessages({
  library(ggplot2)
  library(dplyr)
  requireNamespace("scater")
  requireNamespace("SingleCellExperiment")
})
```


```{r}
plots <- function() {
  print("print")
  cat("cat")
  message("message")
  warning("warning")
  
  # These calls generate unwanted text
  scater::mockSCE(ngene = 77, ncells = 33) %>%
    scater::logNormCounts() %>%
    scater::runPCA() %>%
    SingleCellExperiment::reducedDim("PCA") %>%
    as.data.frame() %>%
    {
      list(
        f12 = ggplot(., aes(x = PC1, y = PC2)) + geom_point(),
        f22 = ggplot(., aes(x = PC2, y = PC3)) + geom_point()
      )
    }
}
```

```{r, message=FALSE, warning=TRUE, results='hide'}
plots()
```

Only the plots and the warnings are shown (and the warnings can be switched off as well).
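
For the histogram case in the question, the same list-of-figures idea might look like the sketch below. The data frame is made up, and building the list with `lapply()` is my choice here rather than something taken from the answer above.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(a = rnorm(50), b = runif(50))  # made-up example data

# One histogram per column, collected in a named list.
plots <- lapply(names(df), function(field) {
  ggplot(df, aes(x = .data[[field]])) +
    geom_histogram(bins = 30) +
    ggtitle(field)
})
names(plots) <- names(df)

# Draw every plot in the list. In a chunk with results='hide',
# writing just `plots` at top level should draw them as well.
for (p in plots) print(p)
```

Unlike `append(PLOTS, p)`, which tends to flatten a ggplot object into its components, `lapply()` keeps each plot as a single list element.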

user66081