There are many ways to do what you want, but without knowing more about your case or seeing an example it is difficult to recommend the right solution.
If you are SURE that each document contains exactly one instance of "Disclosure" and one instance of "Conclusion", you can use the following. Be warned: this assumes that each document's content is a single character string (a length-one vector) and will not work otherwise. It will be relatively slow, but for a few small to medium-sized documents it will work fine.
All I did was write two functions that apply a regex to the content of each document in a corpus. You could also do this with an apply statement instead of tm_map.
# Example texts stored in a character vector (one element per document)
data = c("My fake text Disclosure This is just a sentence Conclusion Don't consider it a file.",
"My second fake Disclosure This is just a sentence Conclusion Don't consider it a file.")
# Create a corpus
library(tm)
library(stringr)
corp = VCorpus(VectorSource(data))
# Remove all stopwords and punctuation
corp = tm_map(corp, removeWords, stopwords("english"))
corp = tm_map(corp, removePunctuation)
# Drop everything before "Disclosure"; the lookahead keeps the word itself
remove_before_Disclosure <- function(doc.in){
  doc.in$content <- str_remove(doc.in$content, ".+(?=Disclosure)")
  return(doc.in)
}
corp2 <- tm_map(corp, remove_before_Disclosure)
# Drop everything after "Conclusion"; the lookbehind keeps the word itself
remove_after_Conclusion <- function(doc.in){
  doc.in$content <- str_remove(doc.in$content, "(?<=Conclusion).+")
  return(doc.in)
}
corp2 <- tm_map(corp2, remove_after_Conclusion)
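For the apply-style route mentioned earlier, here is a minimal base R sketch that works on the plain character vector instead of a corpus. The helper names `count_marker` and `keep_section` are made up for illustration, and the sketch skips the stopword and punctuation steps, so it is not a drop-in replacement for the tm version. It first checks the "exactly one Disclosure and one Conclusion" assumption, then keeps only the text from one marker through the other. The `(?s)` flag is an addition on my part so that the pattern would also cross line breaks.

```r
# Same two example texts as above, as a plain character vector.
texts <- c(
  "My fake text Disclosure This is just a sentence Conclusion Don't consider it a file.",
  "My second fake Disclosure This is just a sentence Conclusion Don't consider it a file."
)

# Hypothetical helper: count literal occurrences of a marker in each text.
count_marker <- function(x, marker) {
  lengths(regmatches(x, gregexpr(marker, x, fixed = TRUE)))
}

# Hypothetical helper: keep the span from "Disclosure" through "Conclusion".
# (?s) lets "." match newlines; ".*?" stops at the first "Conclusion".
# Returns NA when the span is not found.
keep_section <- function(x) {
  m <- regexpr("(?s)Disclosure.*?Conclusion", x, perl = TRUE)
  if (as.integer(m) == -1L) NA_character_ else regmatches(x, m)
}

# Check the one-marker-each assumption before trimming.
n_disclosure <- count_marker(texts, "Disclosure")
n_conclusion <- count_marker(texts, "Conclusion")
stopifnot(all(n_disclosure == 1L), all(n_conclusion == 1L))

# Apply the helper to every text; vapply guarantees a character vector back.
trimmed <- vapply(texts, keep_section, character(1), USE.NAMES = FALSE)
print(trimmed)
```

Both elements come back as "Disclosure This is just a sentence Conclusion". If a document had more than one of either marker, the `stopifnot` call would stop with an error rather than silently trimming at the wrong place.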