Say I have a tall dataframe with many rows per group, like so:
df <- data.frame(group = factor(rep(c("a", "b", "c"), each = 5)),
                 v1 = sample(1:100, 15, replace = TRUE),
                 v2 = sample(1:100, 15, replace = TRUE),
                 v3 = sample(1:100, 15, replace = TRUE))
What I want to do is split df into length(levels(df$group)) separate dataframes, e.g.,
df_a <- df[df$group == "a", ]; df_b <- df[df$group == "b", ]; ...
And then print each dataframe in a separate HTML/PDF/DOCX file (probably using Rmarkdown and knitr).
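For reference, the split itself can be sketched in base R with split(), which returns a named list of dataframes rather than separate df_a, df_b, ... objects. This is only a minimal sketch: the name by_group is made up for illustration, and set.seed() is added only so the sample data is reproducible.

```r
# Example data from the question; set.seed() is an addition for reproducibility.
set.seed(42)
df <- data.frame(group = factor(rep(c("a", "b", "c"), each = 5)),
                 v1 = sample(1:100, 15, replace = TRUE),
                 v2 = sample(1:100, 15, replace = TRUE),
                 v3 = sample(1:100, 15, replace = TRUE))

# split() returns a named list with one dataframe per level of df$group.
# by_group is a hypothetical name chosen for this sketch.
by_group <- split(df, df$group)

names(by_group)   # one list element per group level
by_group[["a"]]   # same rows as df[df$group == "a", ]
```

Keeping the pieces in a list means the number of groups never has to be hard-coded, and the list can be looped over directly.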
I want to do this because I have a large dataframe and want to create a personalized report for each group a, b, c, etc. Thanks.
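To make the goal concrete, here is a rough sketch of one way to get one file per group, using knitr::kable() to render each group's rows as an HTML table. It assumes the knitr package is installed. The output directory (tempdir()), the report_<group>.html file names, and the variable names are all placeholders invented for this sketch. The files are bare HTML fragments, not styled reports, and PDF/DOCX output would need a different tool.

```r
library(knitr)

# Example data from the question; set.seed() is an addition for reproducibility.
set.seed(42)
df <- data.frame(group = factor(rep(c("a", "b", "c"), each = 5)),
                 v1 = sample(1:100, 15, replace = TRUE),
                 v2 = sample(1:100, 15, replace = TRUE),
                 v3 = sample(1:100, 15, replace = TRUE))

out_dir <- tempdir()               # hypothetical output location
by_group <- split(df, df$group)    # one dataframe per group

# Write one HTML file per group and collect the file paths.
report_files <- vapply(names(by_group), function(g) {
  out_file <- file.path(out_dir, paste0("report_", g, ".html"))
  html <- c(paste0("<h3>Records for group ", g, "</h3>"),
            as.character(kable(by_group[[g]], format = "html",
                               row.names = FALSE)))
  writeLines(html, out_file)
  out_file
}, character(1))

report_files   # named character vector of the files written
```

The same loop structure should carry over to a template-based approach: only the body of the function that produces each file would change.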
Update (11/18/14)
Following @daroczig's advice in this thread and another thread, I attempted to make my own template that simply prints a nicely formatted table of all columns and rows per group, to substitute for the "correlations" template call in the original sapply() function. I want to make my own template rather than just printing the nice table (e.g., the answer @Thomas graciously provided) because I'd like to build additional customization into the template once the simple printing works. Anyway, I've certainly butchered it:
<!--head
meta:
  title: Sample Report
  author: Nicapyke
  description: This is a demo
  packages: ~
inputs:
- name: eachgroup
  class: character
  standalone: TRUE
  required: TRUE
head-->
### Records received up to present for Group <%= eachgroup %>
<%=
pandoc.table(df[df$group == eachgroup, ])
%>
Then, after saving that as groupreport.rapport in my working directory, I wrote the following R code, modeled after @daroczig's response:
allgroups <- unique(df$group)
library(rapport)
# Loop over the groups; the loop variable is the value passed to the template input.
for (eachgroup in allgroups) {
  rapport.docx("FILEPATHHERE", eachgroup = eachgroup)
}
I received the error:
Error in openFileInOS(f.out) : File not found!
I'm not sure what happened. I see from the pander documentation that this means it's looking for a system file, but that doesn't mean much to me. Anyway, this error doesn't get at the root of the problem, which is 1) what should go in the inputs section of the custom template's YAML header, and 2) which R code should go in the rapport template vs. in the R script.
I realize I may be making a number of errors that reveal my lack of experience with rapport and pander. Thanks for your patience!
N.B.:
> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.8 dplyr_0.3.0.2 rapport_0.51 yaml_2.1.13 pander_0.5.1
plyr_1.8.1 lattice_0.20-29
loaded via a namespace (and not attached):
[1] assertthat_0.1 DBI_0.3.1 digest_0.6.4 evaluate_0.5.5 formatR_1.0 grid_3.1.2
[7] lazyeval_0.1.9 magrittr_1.0.1 parallel_3.1.2 Rcpp_0.11.3 reshape_0.8.5 stringr_0.6.2
[13] tools_3.1.2