I am writing a program that builds a word cloud. I want a list of words with the punctuation and commonly used words removed. I removed the punctuation using the function removepunc; it works fine. Now I am writing a second function to remove commonly used words (I am not reusing the previous logic, since it would remove the letter "i" from every word along with the pronoun "I"). I converted the file into a list, but I am getting the error IndexError: list index out of range.
CODE:
def removepunc(z):
    test_str=z
    punc = '''!()-[]{};:'""\,<>./?@#$%^&*_~'''
    for ele in test_str:
        if ele in punc:
            test_str = test_str.replace(ele, "")
    return test_str

def removebad(f):
    print(type(f))
    z=[]
    badword2 = ["the", "a", "to", "if", "is", "it", "of", "and", "or", "an", "as", "i", "me",
                "my","we", "our", "ours", "you", "your", "yours", "he", "she", "him", "his", "her",
                "hers", "its", "they","them","their", "what", "which", "who", "whom", "this", "that",
                "am", "are", "was", "were", "be", "been","being","have", "has", "had", "do", "does",
                "did", "but", "at", "by", "with", "from", "here", "when", "where","how","all", "any",
                "both", "each", "few", "more", "some", "such", "no", "nor", "too", "very", "can",
                "will","just"]
    for i in range (len(f)-1):
        if f[i] in badword2:
            x=f.pop(i)
            z.append(x)
        else:
            continue
    return f

file=open("openfile.txt")
a=file.read()
a=a.lower()
unqword=removepunc(a)
ab=unqword.split()
print(type(ab))
unqword1=removebad(ab)
print(unqword1)
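For comparison with the character-by-character loop in removepunc, the same clean-up can be sketched with str.translate. This is only an illustration, not a required change: the name remove_punctuation is made up, and string.punctuation is an assumption that covers a few more characters (such as +, = and |) than the punc string used above.

```python
import string


def remove_punctuation(text):
    """Delete every ASCII punctuation character from text in a single pass."""
    # With three arguments, str.maketrans maps each character of the third
    # argument to None, and str.translate then drops those characters.
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table)


# Made-up sample sentence, not taken from openfile.txt
cleaned = remove_punctuation("hello, world! it's a test-case.")
print(cleaned)
```

Only punctuation characters are deleted here, so letters such as "i" inside words are left alone.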
OUTPUT:
C:\Users\Nitin\PycharmProjects\pythonProject1\venv\Scripts\python.exe C:/Users/Nitin/PycharmProjects/pythonProject1/prjt.py
<class 'list'>
<class 'list'>
Traceback (most recent call last):
  File "C:/Users/Nitin/PycharmProjects/pythonProject1/prjt.py", line 29, in <module>
    unqword1=removebad(ab)
  File "C:/Users/Nitin/PycharmProjects/pythonProject1/prjt.py", line 14, in removebad
    if f[i] in badword2:
IndexError: list index out of range
Process finished with exit code 1
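A likely reading of this traceback: range(len(f)-1) is evaluated once, before the loop starts, while every f.pop(i) makes the list shorter, so i eventually points past the end of the list. The sketch below reproduces that with a tiny made-up word list; the name buggy_remove and the sample words are illustrative and not part of the program above.

```python
def buggy_remove(words, stop_words):
    """Same loop shape as removebad: pops from the list while indexing it."""
    for i in range(len(words) - 1):   # upper bound is fixed before the loop runs
        if words[i] in stop_words:
            words.pop(i)              # list shrinks, later items shift left
    return words


sample = ["the", "cat", "is", "on", "a", "mat"]   # made-up input
try:
    buggy_remove(sample, ["the", "is", "a"])
    outcome = "no error"
except IndexError as exc:
    outcome = f"IndexError: {exc}"

print(outcome)
```

The same shifting has a second effect: after a pop, the word that slides into position i is never checked, so some common words can survive even when no error is raised.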
I have not written the word cloud logic yet; I will do that once I get rid of this error.
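One possible way to get rid of the error, sketched under the assumption that a new list is acceptable instead of modifying the original in place: build the result from the words that are not in the stop list. The names remove_common_words and STOP_WORDS are hypothetical, and the set below is a shortened stand-in for the full badword2 list.

```python
# Hypothetical, shortened stop-word set; the full badword2 list would go here.
STOP_WORDS = {"the", "a", "to", "is", "it", "i", "and"}


def remove_common_words(words, stop_words=STOP_WORDS):
    """Return a new list without the stop words; the input list is not modified."""
    return [word for word in words if word not in stop_words]


text = "i think the cat is in it"   # made-up sample text
filtered = remove_common_words(text.split())
print(filtered)
```

Because whole words are compared, "think" and "in" keep their letter "i" while the standalone "i" is dropped. Using a set for the stop words also makes each membership check fast, which may matter for a large input file.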