I want to write a function that uses a regular expression to pick out the codes in my corpus, uses those codes as a reference dictionary, and builds a term-document matrix (TDM) of their occurrences. This is what I have so far:
library(tm)
corp <- Corpus(DirSource(path))
# "" is a placeholder: this is the pattern I don't know how to write
dictionary <- regexpr("", corp)
regular <- DocumentTermMatrix(corp, control = list(dictionary = dictionary))
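To make the goal concrete, here is a minimal base-R sketch of the kind of result I am after. The sample texts, the document names, and the code format (two capital letters followed by two digits) are assumptions made up for illustration, not taken from the real corpus.

```r
# Hypothetical sample documents standing in for the files in the corpus
docs <- c(doc1 = "Codes AB12 and CD34 appear here, AB12 twice.",
          doc2 = "Only CD34 in this one.")

# Assumed code format: two capital letters followed by two digits
pattern <- "[A-Z]{2}[0-9]{2}"

# gregexpr() locates every match; regmatches() extracts the matched strings
matches <- regmatches(docs, gregexpr(pattern, docs))
names(matches) <- names(docs)

# The unique codes found across all documents form the reference dictionary
dictionary <- sort(unique(unlist(matches, use.names = FALSE)))

# Count each dictionary entry per document: rows are terms, columns are documents
tdm <- vapply(matches,
              function(m) as.integer(table(factor(m, levels = dictionary))),
              integer(length(dictionary)))
rownames(tdm) <- dictionary

print(tdm)
```

The `dictionary` vector built this way could presumably be passed as the `dictionary` control option of `DocumentTermMatrix`, although tm's default tokenizing and lowercasing may need adjusting for codes like these.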
Can anyone help me solve this problem?