I have to create a program that reads in lines of input until a single "." is entered. For each line I have to remove punctuation, convert everything to lower case, and remove stopwords and suffixes. I've managed all of this except removing the suffixes. I've tried .strip, as you can see, but it only accepts one argument and doesn't actually remove the suffixes from the list elements. Any advice/pointers/help? Thanks
stopWords = [ "a", "i", "it", "am", "at", "on", "in", "to", "too", "very", \
"of", "from", "here", "even", "the", "but", "and", "is", "my", \
"them", "then", "this", "that", "than", "though", "so", "are" ]
noStemWords = [ "feed", "sages", "yearling", "mass", "make", "sly", "ring" ]
# -------- Replace with your code - e.g. delete line, add your code here ------------
Text = raw_input("Indexer: Type in lines, that finish with a . at start of line only: ").lower()
while Text != ".":
    LineNo = 0
    x = 0
    y = 0
    i = 0
    # creates new string, cycles through string Text and removes punctuation
    PuncRemover = ""
    for c in Text:
        if c in ".,:;!?&'":
            c = ""
        PuncRemover += c
    SplitWords = PuncRemover.split()
    # loops through SplitWords list, removes value at x if found in stopWords list
    while x < len(SplitWords)-1:
        if SplitWords[x] in stopWords:
            del SplitWords[x]
        else:
            x = x+1
    # loops through SplitWords list, skips words in noStemWords, tries to strip the suffix from the rest
    while y < len(SplitWords)-1:
        if SplitWords[y] in noStemWords:
            y = y+1
        else:
            SplitWords[y].strip("ed")
            y = y+1
    Text = raw_input().lower()
print "lines with stopwords removed:" + str(SplitWords)
print Text
print LineNo
print x
print y
print PuncRemover
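
For reference, a minimal sketch of one way the suffix step could be done. Two things seem to get in the way of the .strip attempt: str.strip treats its argument as a set of characters to trim from both ends (not as a suffix), and it returns a new string instead of changing the list element, so the result has to be stored somewhere. The sketch below uses str.endswith and slicing instead. The names strip_suffix, stem_words and SUFFIXES are hypothetical, and the suffix list is an assumption (the code above only tries "ed"). It also walks the whole list, whereas the loops above stop at len(SplitWords)-1 and so never look at the last word.

```python
# Words that must not be stemmed, copied from the post.
NO_STEM_WORDS = ["feed", "sages", "yearling", "mass", "make", "sly", "ring"]

# Assumed suffix list: the post only tries "ed"; the others are illustrative.
SUFFIXES = ("ing", "ed", "ly", "s")


def strip_suffix(word, suffixes=SUFFIXES):
    """Return word with the first matching suffix removed, if any."""
    for suffix in suffixes:
        # Only cut when something is left over as the stem.
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[:-len(suffix)]
    return word


def stem_words(words, no_stem=NO_STEM_WORDS):
    """Build a new list, because strings cannot be changed in place."""
    return [w if w in no_stem else strip_suffix(w) for w in words]


# strip("ed") trims any leading/trailing 'e' or 'd' characters:
print("edited".strip("ed"))  # prints: it
print(stem_words(["jumped", "feed", "walking", "ring", "cats"]))
# prints: ['jump', 'feed', 'walk', 'ring', 'cat']
```

In the program above, this would mean assigning the result back, e.g. SplitWords = stem_words(SplitWords), rather than calling a string method and discarding what it returns.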