How can I read a Microsoft .docx file in R and get the text as one field and page number as another?
With the readtext package I can read the text, but is there a way to get the page number as well?
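# docxtractr is installed only because it ships the example .docx file
install.packages(c("readtext", "docxtractr"))
library(readtext)
doc <- readtext(system.file("examples/realworld.docx", package = "docxtractr"))
doc$text  # the whole document comes back as a single string, with no page information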
The desired output would be a table with one row per page:
text                page_number
text from page 1    1
text from page 2    2
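To make the target concrete, here is a minimal sketch that builds the desired structure by hand. The name `desired` and the strings are placeholders taken from the table above, not real output from readtext or any other package.

```r
# Hand-built illustration of the desired result: one row per page.
# The text values are placeholders, not extracted from a real file.
desired <- data.frame(
  text = c("text from page 1", "text from page 2"),
  page_number = c(1L, 2L),
  stringsAsFactors = FALSE
)
print(desired)
```

What I am looking for is a way to produce a data frame of this shape directly from the .docx file.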
Please advise.