I want to extract the nouns from a DataFrame column. This is what I do:
import pandas as pd
import nltk
from nltk.tag import pos_tag
df = pd.DataFrame({'pos': ['noun', 'Alice', 'good', 'well', 'city']})
noun = []
for index, row in df.iterrows():
    # pos_tag takes a list of tokens; each row holds a single word here
    noun.append([word for word, pos in pos_tag(row) if pos == 'NN'])
df['noun'] = noun
and I get this for df['noun']:
0 [noun]
1 [Alice]
2 []
3 []
4 [city]
Then I try to strip the non-alphanumeric characters (the brackets) with a regex:
df['noun'].replace('[^a-zA-Z0-9]', '', regex = True)
but I get the same result again:
0 [noun]
1 [Alice]
2 []
3 []
4 [city]
Name: noun, dtype: object
What am I doing wrong?
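One possible explanation, offered as a guess rather than a confirmed diagnosis: each cell of df['noun'] is a Python list, not a string, so the brackets in the printed output are just how a list is displayed and there are no bracket characters for a regex to remove. A minimal sketch of one way to turn the lists into plain strings, assuming the column looks like the output shown above (the hard-coded lists and the `noun_str` column name are hypothetical, used only so the snippet runs without the tagger):

```python
import pandas as pd

# Hypothetical stand-in for the tagger output shown above:
# every cell is a list of words, possibly empty.
df = pd.DataFrame({'noun': [['noun'], ['Alice'], [], [], ['city']]})

# Join each list into one string; an empty list becomes ''.
# 'noun_str' is an illustrative column name, not from the original code.
df['noun_str'] = df['noun'].apply(' '.join)

print(df['noun_str'].tolist())
# ['noun', 'Alice', '', '', 'city']
```

Separately, replace() returns a new Series, so the result would have to be assigned back to a column for any change to persist.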