I would like to extract a word only when it appears as a whole word (singular or plural), like this:
a dog ==> dog
some dogs ==> dog
dogmatic ==> None
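To make the rule above concrete, here is a rough sketch using a word-boundary regex. The helper name `extract_word` is hypothetical, and treating the plural as an optional trailing "s" is an assumption based on the three examples, not a general solution.

```python
import re

def extract_word(text, word):
    """Return `word` if it occurs in `text` as a whole word (optionally plural), else None."""
    # \b on both sides rejects matches inside longer words such as "dogmatic";
    # the optional "s" is an assumed way to accept simple plurals like "dogs".
    pattern = rf"\b{re.escape(word)}s?\b"
    return word if re.search(pattern, text, flags=re.IGNORECASE) else None

for phrase in ["a dog", "some dogs", "dogmatic"]:
    print(phrase, "==>", extract_word(phrase, "dog"))
```

Matching is case-insensitive here so that "Cats" would count as "cat"; irregular plurals are not covered by this sketch.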
There is a similar question, "Extract substring from text in a pandas DataFrame as new column", but it does not fulfill my requirements.
From this dataframe:
import pandas as pd

df = pd.DataFrame({'comment': ['A likes cat', 'B likes Cats',
                               'C likes cats.', 'D likes cat!',
                               'E is educated',
                               'F is catholic',
                               'G likes cat, he has three of them.',
                               'H likes cat; he has four of them.',
                               'I adore !!cats!!',
                               'x is dogmatic',
                               'x is eating hotdogs.',
                               'x likes dogs, he has three of them.',
                               'x likes dogs; he has four of them.',
                               'x adores **dogs**'
                               ]})
How can I get the following output?
    comment                              label  EXTRACT
0   A likes cat                          cat    cat
1   B likes Cats                         cat    cat
2   C likes cats.                        cat    cat
3   D likes cat!                         cat    cat
4   E is educated                        None   cat
5   F is catholic                        None   cat
6   G likes cat, he has three of them.   cat    cat
7   H likes cat; he has four of them.    cat    cat
8   I adore !!cats!!                     cat    cat
9   x is dogmatic                        None   dog
10  x is eating hotdogs.                 None   dog
11  x likes dogs, he has three of them.  dog    dog
12  x likes dogs; he has four of them.   dog    dog
13  x adores **dogs**                    dog    dog
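
One possible direction for producing the `label` column is sketched below. It assumes that `EXTRACT` holds the word to look for in each row (that column is not in the DataFrame constructor above, so it is added by hand here on a shortened sample), and the helper name `find_label` is hypothetical. Plurals are handled only as an optional trailing "s".

```python
import re
import pandas as pd

df = pd.DataFrame({
    "comment": ["A likes cat", "B likes Cats", "F is catholic",
                "x is eating hotdogs.", "x adores **dogs**"],
    # Assumption: EXTRACT is the word to search for in that row's comment.
    "EXTRACT": ["cat", "cat", "cat", "dog", "dog"],
})

def find_label(row):
    # Whole-word, case-insensitive match of the row's own keyword,
    # optionally followed by "s"; returns None when there is no such match.
    pattern = rf"\b{re.escape(row['EXTRACT'])}s?\b"
    found = re.search(pattern, row["comment"], flags=re.IGNORECASE)
    return row["EXTRACT"] if found else None

# Row-wise apply is used because the keyword differs from row to row.
df["label"] = df.apply(find_label, axis=1)
print(df)
```

If every row searched for the same fixed set of words, a single vectorized `df['comment'].str.extract(...)` call with one pattern might be simpler than a row-wise `apply`.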