The following code should print the stemmed words, but instead it prints a generator object.

from nltk.stem import SnowballStemmer
stemmer = SnowballStemmer('english')

words = ["presumably", "presume", "multiply"]
print(stemmer.stem(w) for w in words)

Output: `<generator object <genexpr> at 0x11ad5df68>`

This should be really straightforward - what is going wrong?

Mia
  • Because you created a generator using the generator expression syntax and then printed it: `print(stemmer.stem(w) for w in words)`. What were you *expecting* to happen? Where did you learn to use that syntax? It sounds like you just want a for-loop, `for w in words: print(stemmer.stem(w))`, *not* a generator expression. Although a generator expression looks like shorthand for a for-loop, it is not. – juanpa.arrivillaga Nov 27 '17 at 17:01
  • Or in other terms: use `print([stemmer.stem(w) for w in words])` – timgeb Nov 27 '17 at 17:02
  • Or `print(*(stemmer.stem(w) for w in words))` – tobias_k Nov 27 '17 at 17:03
  • Haha sorry, where did I learn this syntax? My professor gave it as an example during lecture...but when I tried it at home, it did not work. :( But thank you for your suggestions, it now works! – Mia Nov 27 '17 at 17:06
  • Generator expressions (and generators) are a useful construct in general, so you should read about them anyway so that you can use them when you need them :) – Andras Deak -- Слава Україні Nov 27 '17 at 17:07
  • @Mia well, hopefully the target duplicate answers your question, just ask me something if it still isn't clear, but basically, you just want a for-loop. – juanpa.arrivillaga Nov 27 '17 at 17:08
  • @juanpa: I am now trying to apply it to a column within my pandas dataframe, but again I get a lot of generator objects. Could you point me in the right direction? `example['col1'] = example['col1'].apply(lambda x: (english_stemmer.stem(y) for y in x))`. The output looks like this when printing `example['col1']`: 1 .. ... 2 .. ... 3 .. ... – Mia Nov 27 '17 at 17:25
  • Again, *why are you doing this?* **Don't try to use it unless you know what a generator expression is**. If you don't want generator objects, then *don't use generator expressions*. – juanpa.arrivillaga Nov 27 '17 at 17:27
  • How else would I stem all the words for each row within a pandas dataframe? – Mia Nov 27 '17 at 17:29
  • @Mia ... have you looked at the duplicate answer? Have you tried using a for-loop? Or just the equivalent list-comprehension? (although, I would stick to simpler constructs until you have a handle on the basics of the language, learn to walk before you run) – juanpa.arrivillaga Nov 27 '17 at 17:30
  • @juanpa: Yes, I looked at the answer, and yes, I tried a for-loop. I don't get an error but my dataframe does not seem to get populated with the newly stemmed words. :( Shouldn't this do the trick? `for row in example['col1']: for token in row: token = english_stemmer.stem(token)` – Mia Nov 27 '17 at 17:37
  • No, absolutely not. That's just assigning the result of your stemmer to a variable `token`, overwriting it each time in a loop. You could instead put your for-loop in a function, `def stem_words(words): ...` and use a for-loop to populate a list, then return that list. Then, apply your function. `df['col1'].apply(stem_words)` – juanpa.arrivillaga Nov 27 '17 at 17:42
  • Thank you for your patience! I solved the issue. :) – Mia Nov 27 '17 at 18:34
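
The three suggestions from the comments (list comprehension, argument unpacking, plain for-loop) can be compared side by side in a rough sketch like the one below. To keep it runnable without NLTK installed, `toy_stem` is a hypothetical stand-in that is not part of the original question; in real code it would presumably be replaced by `SnowballStemmer('english').stem`, whose output will differ.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a trailing 'ly'."""
    return word[:-2] if word.endswith("ly") else word

words = ["presumably", "presume", "multiply"]

# A generator expression is lazy: nothing is stemmed until it is iterated,
# so printing it shows the generator object rather than the stems.
lazy = (toy_stem(w) for w in words)
print(lazy)

# Option 1: a list comprehension builds the whole list, which prints as a list.
stems = [toy_stem(w) for w in words]
print(stems)

# Option 2: unpack a generator into print's positional arguments.
print(*(toy_stem(w) for w in words))

# Option 3: a plain for-loop, printing one stem per line.
for w in words:
    print(toy_stem(w))
```

Which option fits depends on the goal: the list comprehension is the one to use if the stems are needed again later, while the other two only print them.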

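The helper-function approach suggested in the last comments might look roughly like this. It is only a sketch: `toy_stem` is a hypothetical stand-in for `english_stemmer.stem`, and `rows` is made-up data standing in for the token lists in `example['col1']`, since the actual dataframe was not shown.

```python
def toy_stem(word):
    """Hypothetical stand-in for english_stemmer.stem: strips a trailing 'ly'."""
    return word[:-2] if word.endswith("ly") else word

def stem_words(words, stem=toy_stem):
    """Return a new list holding the stem of every token in `words`."""
    stemmed = []
    for token in words:
        # Collect each result in a list; merely rebinding `token` inside
        # the loop would leave the original data untouched.
        stemmed.append(stem(token))
    return stemmed

# Made-up rows, each one a list of tokens (an assumption about the data shape).
rows = [["presumably", "presume"], ["multiply"]]
stemmed_rows = [stem_words(row) for row in rows]
print(stemmed_rows)

# With pandas, the same function would be applied per row and assigned back:
# example['col1'] = example['col1'].apply(stem_words)
```

The key difference from the loop in the comment above is that the results are stored and returned, and the column is reassigned, instead of overwriting a temporary loop variable.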