From file_test.txt I need to count how many times each word appears in the file, using the nltk.FreqDist() function. For each counted word I need to check whether it is in pos_dict.txt and, if it is, multiply its frequency by the number stored next to that word in pos_dict.txt.
file_test.txt looks like this:
abandon, abandon, calm, clear
pos_dict.txt looks like this for these words:
"abandon":2,"calm":2,"clear":1,...
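For illustration, text in this format can be read into a Python dict so that each word maps directly to its number. This is only a sketch: the helper name parse_weights is hypothetical, and it assumes every entry is a double-quoted word followed by a colon and an integer, as in the sample above.

```python
import re

def parse_weights(text):
    """Turn '"abandon":2,"calm":2' into {'abandon': 2, 'calm': 2}."""
    # Each match is a (word, number) pair; anything else in the text is ignored.
    pairs = re.findall(r'"([^"]+)"\s*:\s*(-?\d+)', text)
    return {word: int(number) for word, number in pairs}

# Sample taken from the three entries shown above.
sample = '"abandon":2,"calm":2,"clear":1'
weights = parse_weights(sample)
print(weights)
```

With a dict, checking whether a word is present (word in weights) and fetching its number (weights[word]) are both exact-match lookups.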
My code is:
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
import nltk
f_input_pos=open('file_test.txt','r').read()
def features_pos(dat):
    tokens = nltk.word_tokenize(dat)
    fdist=nltk.FreqDist(tokens)
    f_pos_dict=open('pos_dict.txt','r').read()
    f=f_pos_dict.split(',')
    for part in f:
        b=part.split(':')
        c=b[-1] #to catch the number
        T2 = eval(str(c).replace("'","")) # convert number from string to int
        for word in fdist:
            if word in f_pos_dict:
                d=fdist[word]
                print(word,'->',d*T2)
features_pos(f_input_pos)
So my output needs to be like this:
abandon->4
calm->2
clear->1
But my output duplicates every line and is obviously multiplying by the wrong numbers. I'm a bit stuck and can't find the error; I'm probably using the for loops wrong. If somebody can help, I would appreciate it :)
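A possible way to get the expected output is sketched below. Three things in the code above look likely to cause the symptoms: T2 holds the number from whichever dictionary entry is currently being split rather than the number belonging to the word being printed; if the word loop runs inside the loop over dictionary entries, every word is printed once per entry; and word in f_pos_dict is a substring test on the raw file text, so even "," matches. The sketch makes some assumptions that are not in the original: the function name weighted_counts is hypothetical, the weights are written as a literal dict instead of being read from pos_dict.txt, a simple regular expression replaces nltk.word_tokenize so that punctuation is dropped and no tokenizer data is needed, and collections.Counter stands in when nltk is not installed (nltk.FreqDist is a Counter subclass).

```python
import re
from collections import Counter

try:
    from nltk import FreqDist  # the class the question asks for
except ImportError:
    FreqDist = Counter  # stand-in so the sketch still runs without nltk

def weighted_counts(text, weights):
    """Multiply each word's frequency by its weight; skip words with no weight."""
    # Assumption: words are runs of letters/apostrophes; commas are discarded.
    tokens = re.findall(r"[a-z']+", text.lower())
    fdist = FreqDist(tokens)
    # One pass over the counted words, one exact dict lookup per word.
    return {word: count * weights[word]
            for word, count in fdist.items()
            if word in weights}

# Literal stand-ins for the contents of file_test.txt and pos_dict.txt.
weights = {"abandon": 2, "calm": 2, "clear": 1}
result = weighted_counts("abandon, abandon, calm, clear", weights)
for word, score in result.items():
    print(word, "->", score)
```

Because the lookup happens once per counted word, each word appears a single time, multiplied by its own number.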