I have a huge number of sentences (just over 100,000), each containing about 10 words on average. I am trying to put them together into one big list so I can use Counter
from the collections
library to show me the frequency with which each word occurs. What I'm doing currently is this:
from collections import Counter
words = []
for sentence in sentenceList:
    words = words + sentence.split()
counts = Counter(words)
I was wondering if there is a way to do the same thing more efficiently. I've been waiting almost an hour for this code to finish executing. I think the concatenation is what is making it take so long, because if I replace the line words = words + sentence.split()
with print(sentence.split())
it finishes executing in seconds. Any help would be much appreciated.
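
In case it helps, here is a rough sketch of the alternative I was considering: feeding each sentence's words straight into the Counter with update() instead of building one giant list by concatenation. The sentenceList here is just a stand-in for my real data, and I'm not sure whether this is the idiomatic way to do it:

from collections import Counter

# stand-in for my real data, which has ~100,000 sentences
sentenceList = ["the quick brown fox", "the lazy dog"]

counts = Counter()
for sentence in sentenceList:
    # add this sentence's words to the running counts without rebuilding a list
    counts.update(sentence.split())

print(counts.most_common(5))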