
I have a data frame containing 10,000 text observations, and I would like to apply a dictionary of values to it. The dictionary contains 10 different categories.

I have run the following code:

library(quanteda)

my_dict <- dictionary(list(
  category1 = Values1$Security,
  category2 = Values1$Conformity,
  category3 = Values1$Tradition,
  category4 = Values1$Benevolence,
  category5 = Values1$Universalism,
  category6 = Values1$`Self-Direction`,
  category7 = Values1$Stimulation,
  category8 = Values1$Hedonism,
  category9 = Values1$Achievement,
  category10 = Values1$Power
))


corp <- corpus(MessageDA1, text_field = 'Text')

toks <- quanteda::tokens(corp)

dfmt <- dfm(toks)

dfmt_dict <- dfm_lookup(dfmt, dictionary=my_dict)

And then I get the following error message:

Error in `set_dfm_featnames<-`(`*tmp*`, value = col_new) : 
ncol(x) == length(value) is not TRUE

How do I fix this?

Here is the same code applied to a much smaller sample. It works there, but it fails on the larger data frame I am actually using:

library(quanteda)

testtext <- c("This is sentence 1.", "This is sentence 2.", "This is sentence 3.")

testmy_tokens <- tokens(testtext)

testmy_dict <- dictionary(list(category1 = c("This", "sentence"),
                           category2 = c("is", "sentence"),
                           category3 = c("sentence", "1"),
                           category4 = c("This", "sentence"),
                           category5 = c("is", "sentence"),
                           category6 = c("sentence", "2"),
                           category7 = c("This", "sentence"),
                           category8 = c("is", "sentence"),
                           category9 = c("sentence", "3"),
                           category10 = c("This", "sentence")))

testmy_dfm <- dfm(testmy_tokens)

testmy_dfm <- dfm_lookup(testmy_dfm , dictionary = testmy_dict)

testmy_dfm
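
One difference between the toy example and the failing code is that the toy dictionary is typed by hand, while `my_dict` is built from columns of `Values1`. If those columns have unequal numbers of terms, they may be padded with `NA` or empty strings. Whether that is what triggers the error here is an unconfirmed assumption, but it is cheap to check. A minimal sketch, where `Values1_demo`, `summarise_terms`, and `clean_terms` are hypothetical names introduced only for illustration:

```r
# Hypothetical stand-in for Values1: columns of unequal length padded with NA or "".
Values1_demo <- data.frame(
  Security   = c("safe", "secure", NA),
  Conformity = c("obey", "", "rule"),
  stringsAsFactors = FALSE
)

# Count NA and blank entries in each column.
summarise_terms <- function(df) {
  data.frame(
    column    = names(df),
    n_na      = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_empty   = vapply(df, function(x) sum(!is.na(x) & trimws(x) == ""), integer(1)),
    row.names = NULL
  )
}

# Trim whitespace, then drop NA, blank, and duplicated entries.
clean_terms <- function(x) {
  x <- trimws(as.character(x))
  unique(x[!is.na(x) & x != ""])
}

term_summary <- summarise_terms(Values1_demo)
print(term_summary)

cleaned <- lapply(Values1_demo, clean_terms)
print(cleaned)

# Build the dictionary from the cleaned named list (only if quanteda is installed).
if (requireNamespace("quanteda", quietly = TRUE)) {
  demo_dict <- quanteda::dictionary(cleaned)
  print(demo_dict)
}
```

If the summary shows non-zero counts for the real `Values1`, rebuilding `my_dict` from the cleaned vectors and rerunning `dfm_lookup()` would show whether those entries were the cause.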
  • Hi! Welcome to SO! Could you please prepare a minimal reproducible example following [this guide](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? That way we can help you out much more easily. Also, I suspect this is the same issue I faced a few days ago. – Francesco Grossetti Apr 18 '23 at 07:02
  • Thanks! I have tried to make a reproducible example. The issue, however, is that the example works, and I suspect the problem lies in the specific data frame I am using, so I am not sure how much this will help. – Anne Sofie Nielsen Apr 18 '23 at 09:58
  • Then I suggest you find where the problem is in your data.frame and extract a couple of examples. One in which you have the issue and the other where everything works as expected. – Francesco Grossetti Apr 18 '23 at 10:52
  • Please provide enough code so others can better understand or reproduce the problem. – Community Apr 18 '23 at 15:16
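
The suggestion above to find where the problem sits in the data frame can be sketched as a chunked search: run the same pipeline on consecutive slices of the input and record which slices raise an error. This is only one possible approach. `find_failing_chunks` is a hypothetical helper, and the demo uses a toy function that fails on `NA` rather than the real quanteda pipeline:

```r
# Run `f` on consecutive chunks of `x`; return the start index of each chunk
# for which `f` raised an error.
find_failing_chunks <- function(x, f, chunk_size = 1000) {
  if (length(x) == 0) return(numeric(0))
  starts <- seq(1, length(x), by = chunk_size)
  failed <- vapply(starts, function(s) {
    idx <- s:min(s + chunk_size - 1, length(x))
    tryCatch({
      f(x[idx])
      FALSE
    }, error = function(e) TRUE)
  }, logical(1))
  starts[failed]
}

# Toy demonstration: the "pipeline" errors whenever a chunk contains NA.
toy_texts <- c("a", "b", NA, "d", "e", "f")
toy_pipeline <- function(txt) {
  if (anyNA(txt)) stop("NA in input")
  invisible(nchar(txt))
}

failing <- find_failing_chunks(toy_texts, toy_pipeline, chunk_size = 2)
print(failing)

# With the objects from the question, the call might look like this (untested):
# find_failing_chunks(
#   MessageDA1$Text,
#   function(txt) dfm_lookup(dfm(tokens(txt)), dictionary = my_dict)
# )
```

If every chunk fails, including very small ones, the texts are probably not the cause and the dictionary itself is the more likely place to look. If only some chunks fail, rerunning with a smaller `chunk_size` on those slices narrows the search to individual rows that can be posted as a reproducible example.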
