I'm trying (as hard as I can) to create a script that generates formatted Word documents from plain text files, using R and the ReporteRs package.

To extract the text from a single txt file, I'm using this code, found in the thread "Dealing with readLines() function in R":

fileName <- "C:/MyFolder/TEXT_TO_BE_PROCESSED.txt"
con <- file(fileName,open="r")
line <- readLines(con)
close(con)

Then I add the extracted text to a docx. First, I create the document from a template:

doc <- docx(template="temp.docx")

Next, I add the title (the first line of the txt file):

doc <- addParagraph( doc, value = line[1], bookmark = "titre", stylename = "Titre")

Then I add the body of the txt file (everything from the second line to the end):

doc <- addParagraph( doc, value = line[2:length(line)], stylename = "Contenu")

Finally, I write the docx file:

writeDoc(doc, file = "output-file.docx")

I want to create a loop so that I can generate multiple docx files from multiple txt files. I would really appreciate your help.

youyou youcef
  • `line[2:length(line)]` will extract everything from the second line to the end. You need to use something like `lapply` with a vector of filenames to get a loop working – Richard Telford Apr 14 '17 at 14:37
  • Thank you for your reply, Richard. I edited the code in the post so that it extracts the text through to the end. I'm not sure how to use `lapply`, but I'll look in the documentation. Can you help if you know? – youyou youcef Apr 14 '17 at 14:58
  • HELP NEEDED !!! – youyou youcef Apr 14 '17 at 17:49
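
For readers unfamiliar with `lapply`, here is a minimal sketch of the pattern suggested in the comments, using base R only. The directory `lapply_demo`, the two demo files, and the variable names are assumptions made up for illustration; the point is only that `lapply` calls the function once per file name and collects the results in a list.

```r
# Hypothetical demo: create two small text files in a temporary directory.
demo_dir <- file.path(tempdir(), "lapply_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("First title", "body a", "body b"), file.path(demo_dir, "one.txt"))
writeLines(c("Second title", "body c"), file.path(demo_dir, "two.txt"))

# Vector of file names to loop over.
txt_files <- file.path(demo_dir, c("one.txt", "two.txt"))

# lapply() calls the function once per file name and returns a list of results.
titles <- lapply(txt_files, function(fileName) {
  line <- readLines(fileName)
  line[1]  # the title is the first line of each file
})
```

In the real script, the body of the function would build and write the docx instead of returning the title.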

2 Answers


You can do something like this with `lapply`:

myFiles <- c("C:/MyFolder/TEXT_TO_BE_PROCESSED.txt", "C:/MyFolder/TEXT_TO_BE_PROCESSED2.txt") # or use list.files()

lapply(myFiles, function(fileName){
  con <- file(fileName,open="r")
  line <- readLines(con) # you could just call readLines(fileName)
  close(con)
  doc <- docx(template="temp.docx")
  doc <- addParagraph( doc, value = line[1], bookmark = "titre", stylename = "Titre")
  doc <- addParagraph( doc, value = line[2:length(line)], stylename = "Contenu") # body: second line to the end
  writeDoc(doc, file = paste0(fileName, "out.docx"))
})
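
Note that `paste0(fileName, "out.docx")` keeps the `.txt` extension inside the output name. If a cleaner name is wanted, one possible sketch is the helper below. `docx_name` is a hypothetical name, not part of ReporteRs; it relies on `tools::file_path_sans_ext` from the `tools` package that ships with R.

```r
# Hypothetical helper (not part of ReporteRs): derive the .docx name
# from the .txt name by dropping the extension first.
docx_name <- function(fileName) {
  paste0(tools::file_path_sans_ext(fileName), ".docx")
}

docx_name("C:/MyFolder/TEXT_TO_BE_PROCESSED.txt")
# "C:/MyFolder/TEXT_TO_BE_PROCESSED.docx"
```

Inside the loop it could be used as `writeDoc(doc, file = docx_name(fileName))`.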
Richard Telford

The solution:

myFiles <- list.files(pattern = "\\.txt$") # only the txt files, so temp.docx and generated docx files are skipped

lapply(myFiles, function(fileName){
  line <- readLines(fileName)
  doc <- docx(template="temp.docx")
  doc <- addParagraph( doc, value = line[1], bookmark = "titre", stylename = "Titre")
  doc <- addParagraph( doc, value = line[2:length(line)], stylename = "Contenu")
  writeDoc(doc, file = paste0(fileName, ".docx"))
})
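
One edge case worth knowing about: if a txt file contains only a title line, `line[2:length(line)]` does not return an empty body, because `2:1` counts down. A small sketch of a safer split using negative indexing follows. `split_title_body` is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper: split the lines of a text file into title and body.
split_title_body <- function(line) {
  list(title = line[1], body = line[-1])  # line[-1] drops the first element
}

one_line <- "Only a title"
one_line[2:length(one_line)]     # 2:1 is c(2, 1), so this gives c(NA, "Only a title")
split_title_body(one_line)$body  # character(0): an empty body
```

With this helper, the loop could skip the second `addParagraph` call whenever the body has length zero.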

Thank you again, Richard.

youyou youcef