I'm using the nltk library in Python, and my background is Java. I don't understand the console output of the code below. Why does Python print this strange representation even though I initialize the variable tokens as a list?
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
import nltk

def tokenize_sentence(sentence):
    tokens = []
    tokens = word_tokenize(sentence)
    tokens = (word for word in tokens if word not in
              set(stopwords.words('english')))
    return tokens

a = "John is an actor."
print(tokenize_sentence(a))
Output:
<generator object tokenize_sentence.<locals>.<genexpr> at 0x10dc5b1a8>
This looks similar to what Java prints for an object whose class does not override toString().
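For comparison, here is a minimal sketch of the behaviour in question that runs without nltk. It assumes the cause is the parentheses around the comprehension, which create a generator expression rather than a list; the earlier `tokens=[]` does not fix the type, because each assignment simply rebinds the name. `STOPWORDS`, `filter_lazy`, and `filter_eager` are hypothetical names used only for illustration, and the small set stands in for `set(stopwords.words('english'))`.

```python
# Hypothetical stand-in for set(stopwords.words('english')),
# so the sketch runs without downloading nltk data.
STOPWORDS = {"is", "an"}

def filter_lazy(tokens):
    # Parentheses create a generator expression: nothing is filtered
    # until the result is iterated.
    return (word for word in tokens if word not in STOPWORDS)

def filter_eager(tokens):
    # Square brackets create a list comprehension: the list is built
    # immediately and prints as a list.
    return [word for word in tokens if word not in STOPWORDS]

tokens = ["John", "is", "an", "actor", "."]

lazy = filter_lazy(tokens)
print(type(lazy).__name__)   # generator
print(list(lazy))            # ['John', 'actor', '.']
print(filter_eager(tokens))  # ['John', 'actor', '.']
```

If this assumption holds, either wrapping the generator in `list(...)` or switching to square brackets should make `print` show the filtered tokens instead of the generator's default representation.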