I have a table with a column of text data. I want to get the frequency counts for each word so I have this code:
cm9_list = (df.cm9.str.split(expand=True).stack().value_counts()).reset_index()
which produces a DataFrame-like object. When I check dtypes, it reports object. I then change the column headers:
cm9_list.columns.values[0] = 'word'
cm9_list.columns.values[1] = 'frequency'
and then I want to remove the row whose word column holds the value 'nan'. (Before this step I do some text processing to strip punctuation, stop words, etc., so I think these 'nan' values were inserted into null cells during that process.)
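To make the setup reproducible, here is a minimal sketch of the counting step. The sample rows in df are hypothetical stand-ins (the real cm9 data is not shown), including a literal 'nan' string like the one described above.

```python
import pandas as pd

# Hypothetical sample data; the real cm9 column is not shown in the question.
df = pd.DataFrame({"cm9": ["red apple", "nan", "green apple", "red pear"]})

# Same pipeline as above: split on whitespace, stack into one Series,
# count each distinct word, then turn the result into a two-column table.
cm9_list = df.cm9.str.split(expand=True).stack().value_counts().reset_index()

print(type(cm9_list))  # the result is a regular pandas DataFrame
print(cm9_list.shape)  # one row per distinct word, two columns
```

The default column labels produced by reset_index() here differ between pandas versions, which is why the headers get renamed afterwards.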
I am getting an error when I try to run this code:
cm9_list = cm9_list[cm9_list.columns[0] != 'nan']
That says:
KeyError: True
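A small sketch of what that expression evaluates to, using a hypothetical two-row table in place of the real one. cm9_list.columns[0] is a single column label, so comparing it with 'nan' gives one plain bool rather than a row-by-row mask.

```python
import pandas as pd

# Hypothetical stand-in for the table; the real data is not shown.
cm9_list = pd.DataFrame({"word": ["apple", "nan"], "frequency": [2, 1]})

# columns[0] is one label (a string), so != yields a single Python bool.
label_check = cm9_list.columns[0] != "nan"
print(type(label_check), label_check)

# Indexing with that bool makes pandas look for a column labelled True.
try:
    cm9_list[label_check]
except KeyError as err:
    print("KeyError:", err)
```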
And I have also tried:
cm9_list = cm9_list[cm9_list['word'] != 'nan']
and get this:
KeyError: 'word'
I have no idea what these errors mean. All I can think of is that it doesn't recognize 'word' as a column name. When I check the column names, though, they look normal:
Index(['word', 'frequency'], dtype='object')
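For comparison, a sketch of the rename and filter done through plain column assignment instead of writing into columns.values. The sample table is hypothetical, and whether the in-place write into columns.values is what triggers KeyError: 'word' is an assumption here, not something confirmed above.

```python
import pandas as pd

# Hypothetical stand-in for the frequency table, with placeholder headers.
cm9_list = pd.DataFrame({"a": ["apple", "nan", "pear"], "b": [2, 1, 1]})

# Assign the new labels through the columns attribute as a whole,
# rather than editing the underlying array element by element.
cm9_list.columns = ["word", "frequency"]

# Compare every value in the word column, which gives a boolean Series.
mask = cm9_list["word"] != "nan"

# Keep only the rows where the mask is True.
filtered = cm9_list[mask]
print(filtered)
```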
What could be the issue? TIA!!