So here is the sample list:
[work,worked,working,play,works,lotus]
I want to strip the -ed, -ing, and -s forms of work, so the result should look like this:
[work,play,lotus]
How can I achieve that with pure Python code, since the NLTK approach seemed to be inaccurate?
2

jackie2jet
-
It's called stemming, and you need a language processing library to do it, not pure python. Please show the NLTK code – OneCricketeer Dec 04 '17 at 12:56
-
Which NLTK stemmer did you use? You can use PorterStemmer(); it will convert the word to its root form. – Sumit S Chawla Dec 04 '17 at 12:58
-
see here: https://stackoverflow.com/questions/24647400/what-is-the-best-stemming-method-in-python – Ora Dec 04 '17 at 13:01
3 Answers
1
You can use the code below:
Code:
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
words = ['work', 'worked', 'working', 'play']
# Reduce every word to its stem
stems = [stemmer.stem(token) for token in words]
# Drop duplicate stems, keeping first-seen order
unique_stems = []
for token in stems:
    if token not in unique_stems:
        unique_stems.append(token)
print(unique_stems)
Output:
['work', 'play']
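As a side note, the de-duplication loop above can be written in one line. This is a minimal sketch, assuming Python 3.7+ (where dict preserves insertion order); the stems list is hard-coded to match the output shown above so the snippet runs without NLTK:

```python
# Stems as shown in the answer's output (hard-coded so NLTK is not needed here)
stems = ['work', 'work', 'work', 'play']

# dict.fromkeys keeps the first occurrence of each key, in insertion order
unique_stems = list(dict.fromkeys(stems))
print(unique_stems)  # ['work', 'play']
```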

Sumit S Chawla
-
It's so inaccurate. How can I simply separate "working" from "work" and slice off the "ing" without also slicing unrelated words like "sing"? – jackie2jet Dec 20 '17 at 10:37
1
In Python, you can use filter to remove values that end with ing or ed.
your_list = ['work', 'worked', 'working', 'play']
# In Python 3, filter returns an iterator, so wrap it in list() before printing
print(list(filter(lambda i: not i.endswith(('ing', 'ed')), your_list)))
This prints the filtered list:
['work', 'play']

Manjunath
0
You can simply do this:
List = ['work','worked','working','play']
[item for item in List if not item.endswith(("ed", "ing"))]

Akash Wankhede
-
This will also remove unrelated words like "bring" or "sing" from the list – jackie2jet Dec 06 '17 at 12:07
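One pure-Python way to address the "sing"/"bring" problem raised in the comments is to drop a word only when removing the suffix leaves another word that is already in the list. This is a minimal sketch, not a real stemmer: the function name drop_inflected and the suffix tuple are illustrative assumptions, and it will miss forms whose spelling changes (for example "baked" or "running").

```python
# Suffixes to treat as inflections (an assumption taken from the question)
SUFFIXES = ("ing", "ed", "s")


def drop_inflected(words, suffixes=SUFFIXES):
    """Remove words that are another listed word plus one of the suffixes."""
    known = set(words)
    result = []
    for word in words:
        # A word counts as inflected only if its base form is also in the list
        is_inflected = any(
            word.endswith(suffix) and word[: -len(suffix)] in known
            for suffix in suffixes
        )
        # Keep base words once, preserving their original order
        if not is_inflected and word not in result:
            result.append(word)
    return result


words = ["work", "worked", "working", "play", "works", "lotus", "sing", "bring"]
print(drop_inflected(words))
```

With the question's list plus "sing" and "bring", this keeps work, play, lotus, sing, and bring, because "s", "br", and "lotu" are not words in the list.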