I'm building a script that counts word frequencies and sorts the results in descending order of count. The counting works, but I can't figure out how to sort the results. Here's my code so far:
import re
import string
words = "HELLO how Are yoU i'm doing well thank you."
#remove punctuation
translation = str.maketrans("","", string.punctuation)
stripped = words.translate(translation)
#print(stripped)
##lowercase
words_clean = stripped.lower()
#print(words_clean)
##tokenize
tokens = words_clean.split()
#print(tokens)
##word count
def word_count(tokens):
    # Map each word to the number of times it appears in tokens
    counts = dict()
    for word in tokens:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts
##sort
print(word_count(tokens))
The results that I get are as follows:
{'hello': 1, 'how': 1, 'are': 1, 'you': 2, 'im': 1, 'doing': 1, 'well': 1, 'thank': 1}
The desired results are:
{'you': 2, 'hello': 1, 'how': 1, 'are': 1, 'im': 1, 'doing': 1, 'well': 1, 'thank': 1}
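For reference, here is a minimal sketch of one way the sort step might look. It assumes Python 3.7 or later, where a dict keeps its insertion order, and it hard-codes the token list that the cleaning steps above produce. The names `sorted_counts` and `most_common` are only illustrative. Treat it as one possible approach rather than the definitive one.

```python
from collections import Counter

# Tokens as produced by the cleaning steps in the question (hard-coded here
# so the sketch is self-contained).
tokens = ["hello", "how", "are", "you", "im", "doing", "well", "thank", "you"]

# Count occurrences of each word.
counts = {}
for word in tokens:
    counts[word] = counts.get(word, 0) + 1

# Sort the (word, count) pairs by count, highest first, and rebuild a dict.
# sorted() is stable, so words with equal counts keep their first-seen order.
sorted_counts = dict(sorted(counts.items(), key=lambda item: item[1], reverse=True))
print(sorted_counts)

# Alternative: Counter does the counting, and most_common() returns a list
# of (word, count) tuples ordered from most to least frequent.
most_common = Counter(tokens).most_common()
print(most_common)
```

The first variant yields a dict in the desired order. The second yields a list of tuples, which can be passed to `dict()` if a dict is needed.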