
What is the easiest way to create a Word document in a loop? I'm fairly new to R Markdown and to working with text in R, so I'm hoping there is an easy way to do the following:

I have a dataset of users, and for each of them I need to create a separate page or document. For example:

df <- data.frame(name = c("Amy", "Bob", "Chloe", "Dan"),  
                 age = c(20, 35, 26, 41),  
                 country = c("USA", "UK", "FR", "AU")) 

I have also pre-defined texts:

text <- c("Name: ",
          "Age: ",
          "Country: ")

Is there an easy way to loop through the names (rows) in df and produce one Word page per person (as plain text, not a table), like this:

Name: Amy
Age: 20
Country: USA

I tried the R Markdown solution from "Use loop to generate section of text in rmarkdown", and it works until I load the officer library (I don't know why, but it doesn't work on my PC no matter what I do), so I tried something like this:

for (i in seq(nrow(df))){
current <- df[i,]
current_value <- c(current$name,current$age, current$country)
df_text <- data.frame(text, current_value)
cat("\n\\pagebreak\n")
}

but the output is all on one line, and it includes the df_text header and row numbers. I want to work on the text formatting later, so is there an easy way to do this? (The real data may run to 100 docx pages/files.)

1 Answer


Files

1. configure.R

## @knitr configure

# --- Be very, very, very quiet loading these packages...
suppressPackageStartupMessages( suppressWarnings( require( purrr     )))
suppressPackageStartupMessages( suppressWarnings( require( rmarkdown )))

# --- Convenience function to create templates
to_file <- function( filespec, what ){
  connection <- file(
      description = filespec
    , open = "wt"
  )
  write( what, connection )
  close( connection )
}
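As a quick check of the helper above, here is a minimal sketch that writes two lines to a temporary file and reads them back. The temporary file and the `demo` chunk label are assumptions made only for this illustration; they are not part of the answer's files.

```r
# Same helper as in configure.R, repeated so the sketch runs on its own
to_file <- function( filespec, what ){
  connection <- file( description = filespec, open = "wt" )
  write( what, connection )   # one element of `what` per line
  close( connection )
}

# Hypothetical input: a tiny script with one labelled chunk
demo_lines <- c( "## @knitr demo", "x <- 1" )
demo_file  <- tempfile( fileext = ".R" )

to_file( demo_file, demo_lines )
written <- readLines( demo_file )
print( written )
```

Because `write()` puts each element of a character vector on its own line, the generated templates stay readable and can be parsed by `knitr::read_chunk()`.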

2. print_asis.R

This is for the MS-Word pagebreak.

# Reference:
#  11.17 Customize the printing of objects in chunks (*)
#  https://bookdown.org/yihui/rmarkdown-cookbook/opts-render.html#opts-render

# Three elements are needed in order to print something ** as is **
# within a chunk (when everything else is formatted by knitr):

# NOTE the following:
# 1. `knit_print`, `knitr` and `asis_output` are all defined in the `knitr` package,
#    so don't mess with them.
# 2. `PRINT_ASIS`, `CONTROL`, and `FORMFEED` can be whatever you want them to be,
#    just remember, if you change them to something else, you make those changes
#    everywhere they appear.

## @knitr print_as_is

# --- (1) Define a `knit_print` method
PRINT_ASIS <- function( x, ... ){
  knitr::asis_output( x )
}

# --- (2) Register the method defined above
registerS3method(
    genname = "knit_print"                 # Generic name
  , class   = "CONTROL"                    # For which class of object?
  , method  = PRINT_ASIS                   # The custom method for this class
  , envir   = asNamespace( "knitr" )
)

# --- (3) Set the class of whatever is to be printed
FORMFEED <- '\\newpage'
class( FORMFEED ) <- "CONTROL"
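To see what the three steps above achieve, the following sketch calls `knitr::knit_print()` directly, outside of any document, and inspects the result. It assumes the knitr package is installed; the variable name `printed` is introduced here only for illustration.

```r
library( knitr )

# (1) The custom method: hand the value to knitr untouched
PRINT_ASIS <- function( x, ... ){
  knitr::asis_output( x )
}

# (2) Register it for objects of class "CONTROL"
registerS3method(
    genname = "knit_print"
  , class   = "CONTROL"
  , method  = PRINT_ASIS
  , envir   = asNamespace( "knitr" )
)

# (3) Tag the page-break marker with that class
FORMFEED <- '\\newpage'
class( FORMFEED ) <- "CONTROL"

# Calling the generic by hand dispatches to PRINT_ASIS
printed <- knitr::knit_print( FORMFEED )
print( inherits( printed, "knit_asis" ) )
```

When a chunk evaluates `FORMFEED` as a bare expression, knitr calls `knit_print()` on it in the same way, so the marker is written into the document as raw text instead of as chunk output.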

3. data.csv

Name,Age,Country
Amy,20,USA
Bob,35,UK
Chloe,26,FR
Dan,41,AU

4. Multipage_MSWord_Doc.Rmd

The contents of this file are described in five pieces (a YAML header, plus four chunks) below:

How it works

The RMarkdown template, Multipage_MSWord_Doc.Rmd, takes input from the CONFIG_FILE and the DATA_FILE, then creates a PAGE_TEMPLATE and a BODY_TEMPLATE.

The PAGE_TEMPLATE uses the headings from the DATA_FILE, so feel free to add fields or change the names of the fields.
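As a rough illustration of how the column headings drive the generated template, the sketch below builds the template lines with base R's vectorised `sprintf()` (used here in place of `purrr::map_chr()`) and then evaluates them for one record. The inline two-row data set and the names `fields` and `template_lines` are assumptions made only for this example.

```r
# Hypothetical stand-in for the data file (first two records only)
THE_DATA <- read.csv( text = "Name,Age,Country\nAmy,20,USA\nBob,35,UK" )

# One line of generated code per column heading;
# '%%s' survives the outer sprintf() as a literal '%s'
fields <- names( THE_DATA )
template_lines <- sprintf(
    "cat( sprintf( '%s: %%s', THE_DATA[ i, '%s' ] ))"
  , fields
  , fields
)
writeLines( template_lines )

# Evaluate the generated code for the first record
i <- 1
for( line in template_lines ){
  eval( parse( text = line ))
  cat( "\n" )
}
```

Adding a column to the data therefore adds a line to every page without touching the template code.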

The YAML header

Note that you must edit the PATH in the header so that it points to the directory where you have saved the data and configuration files.

---
output: word_document
params:
  PATH:          'path/to/your/MultipageDoc'    # <------- EDIT THIS
  CONFIG_FILE:   'configure.R'
  DATA_FILE:     'data.csv'
  PAGE_TEMPLATE: 'AUTOGENERATED_page_template.R'
  BODY_TEMPLATE: 'AUTOGENERATED_body_template.R'
---

The first chunk

The first chunk locates all the files and reads the input data.

```{r STEP1_CONFIGURE, eval=TRUE,echo=FALSE}
# --- Use these labels to refer to the various files
CONFIG_FILE    <- file.path( params$PATH, params$CONFIG_FILE    )
PAGE_TEMPLATE  <- file.path( params$PATH, params$PAGE_TEMPLATE  )
BODY_TEMPLATE  <- file.path( params$PATH, params$BODY_TEMPLATE  )
DATA_FILE      <- file.path( params$PATH, params$DATA_FILE  )

# --- Get the special sauce that prints MS-Word pagebreaks
knitr::read_chunk( file.path( params$PATH, 'print_asis.R' ))

# --- The configuration file must exist
stopifnot( file.exists( CONFIG_FILE   ))
knitr::read_chunk( CONFIG_FILE )

# --- The data file must exist
stopifnot( file.exists( DATA_FILE   ))
THE_DATA <- read.csv( DATA_FILE)
i <- 0    # Initialize the record/page counter
```

The second chunk

The second chunk generates the PAGE_TEMPLATE, which controls which data are on the page and how the data are formatted.

```{r STEP2_PAGE_TEMPLATE, eval=TRUE,echo=FALSE}

# --- Evaluate the <<configure>> chunk

<<configure>>

# --- Find out what this chunk is called and what the .Rmd file name is, as well
#     so we can put a comment in the templates that we generate. This way,
#     when you want to change a template, you can see where to find the code that
#     made it.

THIS_CHUNK <- knitr::opts_current$get()$label
THIS_FILE  <- knitr::current_input()

# --- Define the page template

stuff_to_put_in_page_template <- list(
    chunk_start = "## @knitr page_template\n"
  , counter = 'i <- 1 + i'
  , data = names( THE_DATA ) %>%
      purrr::map_chr(
        ~sprintf(
            "cat( sprintf( '%s: %%s', THE_DATA[ i, '%s' ] ))"
          ,                 .x
          ,                                         .x
         )
      )
  , notice = sprintf(
       "\n# NOTE: This file is generated by %s of %s\n#       Do not edit manually!\n"
     ,                              THIS_CHUNK
     ,                                     THIS_FILE
    )
)
stuff_to_put_in_page_template <- unlist( stuff_to_put_in_page_template )

# --- Write the page template to disk
to_file( PAGE_TEMPLATE, stuff_to_put_in_page_template )

# --- Make sure it gets there
stopifnot( file.exists( PAGE_TEMPLATE ))

# --- Now read it back as a chunk
knitr::read_chunk( PAGE_TEMPLATE )
```

The third chunk

The third chunk generates the BODY_TEMPLATE, which glues the PAGEs together with a FORMFEED.

```{r STEP3_BODY_TEMPLATE, eval=TRUE,echo=FALSE}

# --- Find out what this chunk is called
#     so we can put a comment in the templates that we generate. This way,
#     when you want to change a template, you can see where to find the code that
#     made it.

THIS_CHUNK <- knitr::opts_current$get()$label

# --- Glue the pages together with form feeds
first_item    <- "<<page_template>>"
repeated_part <- "FORMFEED\n<<page_template>>"

# --- Define the body template

stuff_to_put_in_contents_chunk <- list(
    chunk_start = "## @knitr contents\n"
  , data = c( first_item, rep( repeated_part, nrow( THE_DATA ) - 1 ))
  , notice = sprintf(
       "\n# NOTE: This file is generated by %s of %s\n#       Do not edit manually!\n"
     ,                              THIS_CHUNK
     ,                                     THIS_FILE
    )
)
stuff_to_put_in_contents_chunk <- unlist( stuff_to_put_in_contents_chunk )

# --- Write the body template to disk
to_file( BODY_TEMPLATE, stuff_to_put_in_contents_chunk )

# --- Make sure it gets there
stopifnot( file.exists( BODY_TEMPLATE ))

# --- Now read it back as a chunk
knitr::read_chunk( BODY_TEMPLATE )

# --- Evaluate the <<print_as_is>> chunk
<<print_as_is>>
```
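For a sense of what ends up in the body template, here is a small sketch that builds the same lines for four records and prints them. The record count is hard-coded as an assumption; in the chunk above it comes from `nrow( THE_DATA )`.

```r
# Assumed number of records (the sample data.csv has four)
n_records <- 4

first_item    <- "<<page_template>>"
repeated_part <- "FORMFEED\n<<page_template>>"

# One page reference per record, with a FORMFEED before every page but the first
body_lines <- c(
    "## @knitr contents\n"
  , first_item
  , rep( repeated_part, n_records - 1 )
)
writeLines( body_lines )
```

The result contains one page reference per record and one fewer FORMFEED. Note that with an empty data file `n_records - 1` is negative and `rep()` stops with an error, so a real template may want to check for that case first.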

The fourth chunk

The fourth chunk puts it all together and renders the multipage document. Note that this chunk sets the comment option to an empty string (comment=''). This overrides the default formatting, which puts two pound signs at the beginning of each line of output.


```{r STEP4_RENDER_DOCUMENT, eval=TRUE,echo=FALSE,comment=''}
# ,comment='' to prevent default ## at the beginning of each line of output

# --- Evaluate the <<contents>> chunk, populating the body of the multipage document.

<<contents>>
```

Chunk names

Wherever you see ## @knitr something, this is the name of a chunk, that is, a way to reference everything that follows it (up to the next chunk name, anyway).

Wherever you see <<something>>, the chunk called something is being referenced.
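The labelling mechanism can be tried outside a document. The sketch below writes a small script with two labelled chunks, loads it with `knitr::read_chunk()`, and looks the chunks up by name through `knitr::knit_code`. It assumes knitr is installed; the labels `greet` and `shout` and the temporary file are made up for this illustration.

```r
library( knitr )

# Hypothetical script containing two labelled chunks
script <- tempfile( fileext = ".R" )
writeLines(
    c( "## @knitr greet"
     , "message_text <- 'hello'"
     , ""
     , "## @knitr shout"
     , "toupper( message_text )"
     )
  , script
)

# Register every labelled chunk found in the script
knitr::read_chunk( script )

# Look the code up by label, as <<greet>> and <<shout>> would inside a chunk
greet_code <- as.character( knitr::knit_code$get( "greet" ) )
shout_code <- as.character( knitr::knit_code$get( "shout" ) )
print( greet_code )
print( shout_code )
```

Inside an .Rmd chunk, writing `<<greet>>` inserts that stored code at that point before the chunk is evaluated.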

Invoke from command-line

Render the document from the command line using
Rscript -e "rmarkdown::render('Multipage_MSWord_Doc.Rmd', output_file = 'Multipage.docx')"

Karl Edwards