I am trying to count how often each word occurs in a variable. The variable holds more than 700,000 observations. The output should be a dictionary of the words that occur most often. I used the code below to do this:
d1 = {}
for i in range(len(words)-1):
    x=words[i]
    c=0
    for j in range(i,len(words)):
        c=words.count(x)
    count=dict({x:c})
    if x not in d1.keys():
        d1.update(count)
I ran the code on the first 1000 observations and it worked perfectly. The output is shown below:
[('semantic', 23),
('representations', 11),
('models', 10),
('task', 10),
('data', 9),
('parser', 9),
('language', 8),
('languages', 8),
('paper', 8),
('meaning', 8),
('rules', 8),
('results', 7),
('performance', 7),
('parsing', 7),
('systems', 7),
('neural', 6),
('tasks', 6),
('entailment', 6),
('generic', 6),
('te', 6),
('natural', 5),
('method', 5),
('approaches', 5)]
When I try to run it on 100,000 observations, it keeps running. I let it run for more than 24 hours and it still had not finished. Does anyone have an idea how to speed this up?
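
For comparison, here is a minimal sketch of a single-pass count using `collections.Counter` from the standard library. It assumes `words` is a flat list of strings. The helper name `top_words`, its parameter `n`, and the `sample` list are made up for illustration and are not part of the code above. The likely reason the loops above are slow is that each call to `words.count(x)` scans the whole list, and that call is repeated inside the loops, whereas `Counter` only walks the list once.

```python
from collections import Counter


def top_words(words, n=10):
    """Return the n most frequent words as (word, count) pairs.

    Hypothetical helper for illustration; `words` is assumed to be
    a flat list of strings.
    """
    counts = Counter(words)        # single pass over the list
    return counts.most_common(n)   # sorted by count, highest first


# Small made-up sample, not the real data.
sample = ["semantic", "task", "semantic", "data", "task", "semantic"]
top = top_words(sample, n=2)
print(top)        # [('semantic', 3), ('task', 2)]
print(dict(top))  # {'semantic': 3, 'task': 2}
```

Calling `dict(...)` on the result gives a dictionary, as asked for at the top, while the list of pairs keeps the ordering by count shown in the output above.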