
Here is a sample list:
['work', 'worked', 'working', 'play', 'works', 'lotus']
I want to strip the -ed, -ing, and -s forms of work, so the result should look like this:
['work', 'play', 'lotus']
How can I achieve that with pure Python code, since the NLTK approach seemed to be inaccurate?

jackie2jet
  • It's called stemming, and you need a language processing library to do it, not pure Python. Please show the NLTK code. – OneCricketeer Dec 04 '17 at 12:56
  • Which NLTK stemmer did you use? You can use PorterStemmer(); it converts a word to its root form. – Sumit S Chawla Dec 04 '17 at 12:58
  • See here: https://stackoverflow.com/questions/24647400/what-is-the-best-stemming-method-in-python – Ora Dec 04 '17 at 13:01

3 Answers

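You can use NLTK's PorterStemmer to reduce each word to its stem, then drop the duplicates while keeping the original order:

Code:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

words = ['work', 'worked', 'working', 'play']
# Reduce every word to its stem, e.g. 'worked' -> 'work'
stems = [stemmer.stem(token) for token in words]

# Collect each stem once, preserving first-seen order
unique_stems = []
for stem in stems:
    if stem not in unique_stems:
        unique_stems.append(stem)

print(unique_stems)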

Output:

['work', 'play']
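
The de-duplication loop above can also be written more compactly. The following is a minimal sketch, not part of the original answer: the `stems` list is hard-coded as an assumed stemmer output so the snippet runs without NLTK installed.

```python
# Assumed stemmer output for ['work', 'worked', 'working', 'play'];
# hard-coded here so the sketch does not depend on NLTK.
stems = ["work", "work", "work", "play"]

# dict keys are unique and keep insertion order (guaranteed since Python 3.7),
# so this removes duplicates without reordering the list.
unique_stems = list(dict.fromkeys(stems))
print(unique_stems)  # ['work', 'play']
```

Unlike `set(stems)`, this keeps the words in the order they first appeared.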
Sumit S Chawla

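In Python, you can use filter to remove the values that end with ing or ed.

your_list = ['work', 'worked', 'working', 'play']
# endswith accepts a tuple of suffixes; in Python 3, filter returns an iterator, so wrap it in list()
print(list(filter(lambda word: not word.endswith(('ing', 'ed')), your_list)))

This prints a list: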

['work', 'play']
Manjunath

You can simply use a list comprehension:

words = ['work', 'worked', 'working', 'play']

# Keep only the words that do not end with "ed" or "ing"
print([item for item in words if not item.endswith(("ed", "ing"))])
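
Note that a plain suffix filter also drops words such as 'red' or 'king', and it keeps 'works' from the question's full sample list. One possible pure-Python refinement, sketched here under assumptions, is to drop a word only when stripping a suffix leaves another word that is already in the list. The names `SUFFIXES` and `drop_inflected` are hypothetical, chosen for illustration only.

```python
# Assumed suffix set, taken from the forms named in the question.
SUFFIXES = ("ing", "ed", "s")


def drop_inflected(words, suffixes=SUFFIXES):
    """Keep a word unless removing one suffix yields another word in the list."""
    vocabulary = set(words)
    kept = []
    for word in words:
        # A word counts as inflected only if its base form is also present.
        is_inflected = any(
            word.endswith(suffix) and word[: -len(suffix)] in vocabulary
            for suffix in suffixes
        )
        if not is_inflected and word not in kept:
            kept.append(word)
    return kept


sample = ["work", "worked", "working", "play", "works", "lotus"]
print(drop_inflected(sample))  # ['work', 'play', 'lotus']
```

This keeps 'lotus' because 'lotu' is not in the list. It does not handle spelling changes such as 'making' to 'make' or 'stopped' to 'stop'; those cases are what a real stemmer or lemmatizer is for.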
Akash Wankhede