I am a beginner R user and need some help with my project.

I want to build a quanteda corpus for PDF text analysis. I have written a function that rebuilds the corpus after the cleaning process by pasting each document's tokens back together in their original order.

```r
# Rebuild the corpus by putting the tokens back together in the same order
corpus.tokens <- function(x, ...) {
  quanteda:::build_corpus(
    unlist(lapply(x, paste, collapse = " ")),
    docvars = cbind(quanteda:::make_docvars(length(x), docnames(x), docvars(x)))
  )
}
```

What I am trying to do is take each of my tokens objects and turn it back into a corpus by calling quanteda's `corpus()` function on it.

```r
class1_corp <- corpus(class1toks)
class2_corp <- corpus(class2toks)
class3_corp <- corpus(class3toks)
class4_corp <- corpus(class4toks)
class5_corp <- corpus(class5toks)
```

When I run the `corpus()` calls, I get the following error message: `Error in unique && any(duplicated(docname)) : invalid 'x' type in 'x && y'`

I am not sure what this message means (I searched for it on Google and could not find anything), and I do not know what I am doing wrong. Any help would be much appreciated!

You should supply a reproducible, minimal example that produces your problem, for instance with stylised, invented texts, and then we can solve the issue for you. See for instance https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610. – Ken Benoit Dec 06 '22 at 14:37
