I have a column containing the first word of each text, and I'm trying to check whether it is in the NLTK word list. My code is:

from nltk.corpus import words
wordlist = words.words()
reviews['test'] = reviews.apply(lambda x: True if x['FIRST_WORD'] in wordlist else False )

I'm getting an error:

KeyError: 'FIRST_WORD'

Not sure why because that is definitely the name of the column in my data set. Have I set up the lambda function wrong?

user3242036

1 Answer

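The call is missing axis=1. By default, DataFrame.apply uses axis=0 and passes each column to the function as a Series indexed by the row labels, so x['FIRST_WORD'] looks for a row label named 'FIRST_WORD' and raises the KeyError. Pass axis=1 to apply the function to each row instead: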

>>> df
  FIRST_WORD SECOND_WORD
0      hello        blah
1         hi        blah
2        xyz        blah

>>> df.apply(lambda x: True if x['FIRST_WORD'] in wordlist else False, axis=1)
0     True
1     True
2    False
dtype: bool

>>> # OR

>>> df['FIRST_WORD'].apply(lambda x : True if x in wordlist else False)
0     True
1     True
2    False
Name: FIRST_WORD, dtype: bool
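
To see why the axis matters, here is a minimal sketch. It assumes a tiny hand-written wordlist as a stand-in for words.words() so that it runs without downloading the NLTK corpus, and it uses iloc to pull out objects equivalent to what apply hands to the function under each axis.

```python
import pandas as pd

# Hypothetical stand-in for nltk.corpus.words.words()
wordlist = ["hello", "hi", "blah"]

df = pd.DataFrame({
    "FIRST_WORD": ["hello", "hi", "xyz"],
    "SECOND_WORD": ["blah", "blah", "blah"],
})

# With axis=0 (the default) the function receives a column:
# a Series whose index holds the row labels 0, 1, 2.
first_column = df.iloc[:, 0]
in_column_index = "FIRST_WORD" in first_column.index

# With axis=1 the function receives a row:
# a Series whose index holds the column names.
first_row = df.iloc[0]
in_row_index = "FIRST_WORD" in first_row.index

# Reproduce the error from the question by omitting axis=1.
raised = False
try:
    df.apply(lambda x: x["FIRST_WORD"] in wordlist)
except KeyError:
    raised = True

# The row-wise call works because each row has a 'FIRST_WORD' label.
row_wise = df.apply(lambda x: x["FIRST_WORD"] in wordlist, axis=1)
```

Here in_column_index is False and in_row_index is True, which is why the lookup only succeeds with axis=1.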

That said,

  1. For operations on a single column, see "How can I use the apply() function for a single column?" and "pandas DataFrame, how to apply function to a specific column?", which cover the difference between map and apply.
  2. Another way to get the same result is isin, which tests every value in the column for membership in wordlist:
>>> df['FIRST_WORD'].isin(wordlist)
0     True
1     True
2    False
Name: FIRST_WORD, dtype: bool
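
If the word list is large, converting it to a set first may help, because each `in` test on a list is a linear scan while a set lookup is constant time on average. The sketch below is an illustration under assumptions: the short wordlist is a hypothetical stand-in for words.words(), and word_set is a name introduced here.

```python
import pandas as pd

# Hypothetical stand-in for nltk.corpus.words.words()
wordlist = ["hello", "hi", "blah"]

df = pd.DataFrame({
    "FIRST_WORD": ["hello", "hi", "xyz"],
    "SECOND_WORD": ["blah", "blah", "blah"],
})

# Build the set once, outside any per-row function.
word_set = set(wordlist)

# Vectorized membership test; isin accepts a set as well as a list.
df["test"] = df["FIRST_WORD"].isin(word_set)

# Equivalent element-wise form using the set.
via_apply = df["FIRST_WORD"].apply(lambda w: w in word_set)
```

Both forms give the same boolean column; the set mainly matters for the apply version, where the lookup runs once per row.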
Nikolaos Chatzis