I am trying to perform text analytics on the following text file. The code I have written to tokenize this text after importing it is:

my_data <- read.delim("5KjlUO.txt")
library(tokenizers)
library(SnowballC)
tokenize_words(my_data$ACT.I)
tokenize_words(my_data)

I am getting the following error:

Error in check_input(x) : Input must be a character vector of any length or a list of character vectors, each of which has a length of 1.

Can someone help me resolve this issue?

    Please tag your programming language ([tag:r]?), so people who have experience with it will be alerted to your question; and also note that we prefer any text to be actually text, not images. – Amadan Aug 24 '18 at 10:22
    Make your text available outside of kaggle. Not everyone has an account / wants to go through the hassle of downloading the file from kaggle. Check this post on how to make a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – phiver Aug 24 '18 at 11:40
    What does `class(my_data$ACT.I)` return? Are you sure it's a character vector? If you used `read.delim()` like that, it's probably a factor variable. Try `tokenize_words(as.character(my_data$ACT.I))`. – MrFlick Aug 24 '18 at 14:58
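Putting MrFlick's suggestion together, a minimal sketch of the likely fix. It assumes the file's first line is a header that `read.delim()` turns into the column name `ACT.I` (as in the question); `stringsAsFactors = FALSE` and the `as.character()` coercion are additions not in the original code:

```r
library(tokenizers)

# Read the file; stringsAsFactors = FALSE keeps text columns as character
# vectors instead of factors (the default before R 4.0).
my_data <- read.delim("5KjlUO.txt", stringsAsFactors = FALSE)

# tokenize_words() requires a character vector, not a whole data frame,
# so pass the column itself, coerced in case it was read as a factor.
tokens <- tokenize_words(as.character(my_data$ACT.I))
```

Calling `tokenize_words(my_data)` fails because `my_data` is a data frame, which is why `check_input(x)` raises the "Input must be a character vector" error.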
