I wrote a function to count the 20 most common words in a book that I downloaded in plain text format. The Python textbook I am working from said to use import string and then the replace or the translate method to remove any punctuation, but when I print the lines after the replace step, they all still contain punctuation. I tried moving the line = line.strip() and line = line.replace(string.punctuation,'') steps around, but that did not work. I have never used replace before, so I may be using it wrong. The rest of my program works; only this step is frustrating me.
import string

def function():
    infile = open('gutbook.txt', 'r', encoding='utf-8')
    count = dict()
    list2 = list()
    for line in infile:
        line = line.strip()
        line = line.replace(string.punctuation, '')
        line = line.lower().split()
        if line == []:
            continue
        for i in line:
            count[i] = count.get(i, 0) + 1
    for key, value in count.items():
        newtuple = (value, key)
        list2.append(newtuple)
    list3 = sorted(list2, reverse=True)
    print(list3[:20])

function()
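
A likely explanation for the behavior described above: str.replace treats its first argument as one contiguous substring, so line.replace(string.punctuation, '') only removes text when the entire run of punctuation characters in string.punctuation appears verbatim in the line, which essentially never happens in a book. The translate method the textbook mentions can delete each punctuation character individually instead. The following is a minimal sketch of that approach, not a definitive fix; the helper name strip_punctuation and the sample sentence are made up for illustration and are not part of the original program.

```python
import string

# Hypothetical helper name, introduced only for this illustration.
def strip_punctuation(line):
    # The third argument to str.maketrans lists characters to delete.
    table = str.maketrans('', '', string.punctuation)
    return line.translate(table)

# Made-up sample text, standing in for one line of the book.
sample = "Hello, world! It's a test."

# replace() looks for the whole punctuation string as a single substring,
# which does not occur in the sample, so the text comes back unchanged.
unchanged = sample.replace(string.punctuation, '')

# translate() removes every character listed in string.punctuation.
cleaned = strip_punctuation(sample)

print(unchanged)
print(cleaned)
```

In the loop from the post, this would presumably mean swapping the replace line for line = line.translate(table), with the table built once before the loop. Note that string.punctuation covers only ASCII punctuation, so curly quotes or long dashes in a UTF-8 book file would still remain after this step.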