4

I have a data frame like the following:

library(dplyr)
mydf <- data_frame(headline = c('this is the first news',
                                'this is the second news'),
                   fulltext = c('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum',
                                'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum'))

Essentially, I would like to create a document (PDF, HTML, whatever) that, for each article, prints the headline followed by the first 100 characters of the fulltext column.

Something like

-- start of html/pdf output

this is the first news

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident

this is the second news

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident

-- end of html/pdf output

How can I do that with knitr?

Jaap
ℕʘʘḆḽḘ
  • Related: [how to create a loop that includes both a code chunk and text with knitr in R](https://stackoverflow.com/questions/36373630/how-to-create-a-loop-that-includes-both-a-code-chunk-and-text-with-knitr-in-r); [Create parametric R markdown documentation?](https://stackoverflow.com/questions/14959312/create-parametric-r-markdown-documentation) – Henrik Oct 23 '17 at 21:25

2 Answers

4

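This can be done with a combination of a for loop, cat(), and the chunk option results = 'asis', which makes knitr pass the chunk's printed output through as raw markdown instead of wrapping it in a code block: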

---
title: "Untitled"
output: html_document
---

```{r, include = FALSE}
library(dplyr)
mydf <- data_frame(headline = c('this is the first news',
                                'this is the second news'),
                   fulltext = c('Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum',
                                'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum'))
```

```{r, echo = FALSE, results = 'asis'}
for (i in seq_len(nrow(mydf))){
  cat(paste0("**", mydf[["headline"]][i], "**"))
  cat("\n\n")
  cat(
    paste0(
      gsub("\\n", "\n\n", substr(mydf[["fulltext"]][i], 1, 100)),
      "..."
    )
  )
  cat("\n\n")
}
```
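
Not part of the original answer: the loop body can also be wrapped in a small vectorised helper, which makes the formatting easier to check outside of knitr. The sketch below uses base R only; the function name `format_articles` and its argument `n` are hypothetical, and the sample strings are shortened stand-ins for the real data.

```r
# Hypothetical helper: build the markdown for all articles at once.
format_articles <- function(headline, fulltext, n = 100) {
  # Keep only the first n characters of each article
  teaser <- substr(fulltext, 1, n)
  # Double every line break so markdown starts a new paragraph
  teaser <- gsub("\n", "\n\n", teaser, fixed = TRUE)
  # Bold headline, blank line, teaser, ellipsis, blank line
  paste0("**", headline, "**\n\n", teaser, "...\n\n")
}

md <- format_articles(
  headline = c("this is the first news", "this is the second news"),
  fulltext = c("Lorem ipsum dolor sit amet, consectetur adipiscing elit",
               "Ut enim ad minim veniam"),
  n = 20
)
cat(md, sep = "")
```

Inside a chunk with results = 'asis', the same call would presumably be `cat(format_articles(mydf$headline, mydf$fulltext), sep = "")`.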
Benjamin
  • very nice! but this solution does not seem to take into account any possible `\n` included in the full text. do you see some easy fix here? thanks! – ℕʘʘḆḽḘ Oct 23 '17 at 18:30
  • What do you want to do with the `\n` in `fulltext`? Should that create a new paragraph, or be ignored? – Benjamin Oct 23 '17 at 18:32
  • ideally, new paragraph. OBEY what is written in the full text that is :D – ℕʘʘḆḽḘ Oct 23 '17 at 18:33
  • Edited the answer: It's crude, but it replaces any instance of `\n` with `\n\n` (you need two line breaks to make a new paragraph in markdown; if this creates excessive line breaks in the markdown, they should be ignored as unnecessary whitespace). – Benjamin Oct 23 '17 at 18:36
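
To illustrate the replacement discussed in the comments above, here is a small base R sketch on a made-up string. It shows the doubling approach next to a variant (an assumption, not something the answer uses) that collapses each run of line breaks into exactly one blank line, which avoids the extra line breaks mentioned above.

```r
# Made-up sample text with a single and a double line break
txt <- "First paragraph.\nSecond paragraph.\n\nThird paragraph."

# Approach from the answer: every line break becomes two
doubled <- gsub("\n", "\n\n", txt, fixed = TRUE)

# Variant: any run of line breaks becomes exactly two
collapsed <- gsub("\n+", "\n\n", txt)

cat(collapsed)
```

Both results should render as three paragraphs in markdown; they differ only in how much whitespace ends up in the intermediate file.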
2

Do you really need knitr for this? A crude way to do it:

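library(knitr)
for (i in seq_len(nrow(mydf))) {
  # Headline, an HTML line break, then the full text
  temp <- paste(mydf[[1]][i], '<br>', '\n', mydf[[2]][i])
  # One text file per article, without quotes or row/column names
  write.table(temp, paste0(i, '.txt'), quote = FALSE, row.names = FALSE, col.names = FALSE)
  knit(paste0(i, '.txt'), paste0(i, '.html'))
}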