I'm currently running this code:
for dicword in dictionary:
    for line in train:
        for word in line:
            if dicword == word:
                pWord[i] = pWord[i] + 1
    i = i + 1
Here dictionary and pWord are 1D lists of the same size, and train is a 2D list (a list of lists).
Both dictionary and train are very large, and the code executes slowly.
How can I optimize this particular piece of code and code like this in general?
Edit: train is a list of about 2000 lists, each of which contains the individual words pulled from one document. dictionary was created by pulling each unique word from all of train.
Here is the creation of dictionary:
dictionary = []
for line in train:
    for word in line:
        if word not in dictionary:
            dictionary.append(word)
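For comparison, here is a minimal sketch of the same first-seen-order deduplication done in a single pass. It assumes every word is a hashable string and that the code runs on Python 3.7 or later, where dict preserves insertion order. The helper name build_dictionary and the sample data are hypothetical, not part of my actual code.

```python
def build_dictionary(train):
    """Return the unique words of train, in first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so each word costs an average O(1) hash lookup instead of
    # a linear scan through a growing list.
    return list(dict.fromkeys(word for line in train for word in line))


# Hypothetical sample data in the same shape as train.
sample_train = [["It", "ran", "at"], ["the", "same", "ran"]]
print(build_dictionary(sample_train))  # ['It', 'ran', 'at', 'the', 'same']
```

The `word not in dictionary` check in the loop above scans the whole list for every word, which is likely where most of the time goes when building dictionary.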
Edit 2: Sample of the content in each list:
[ ... , 'It', 'ran', 'at', 'the', 'same', 'time', 'as', 'some', 'other', 'programs', 'about', ...]
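To make the intended result concrete, here is a hedged sketch of what the counting loop computes, written as one pass over train with collections.Counter. It assumes pWord starts as all zeros and that the words are hashable strings. The function name count_words and the sample lists are hypothetical.

```python
from collections import Counter


def count_words(dictionary, train):
    """Return a list of counts aligned with dictionary (like pWord)."""
    # One pass over every word in train builds a word -> count table.
    counts = Counter(word for line in train for word in line)
    # A Counter returns 0 for missing keys, so words absent from
    # train get a count of 0 instead of raising KeyError.
    return [counts[dicword] for dicword in dictionary]


# Hypothetical sample data in the same shape as train and dictionary.
sample_train = [["It", "ran", "at"], ["the", "same", "ran"]]
sample_dictionary = ["It", "ran", "at", "the", "same"]
print(count_words(sample_dictionary, sample_train))  # [1, 2, 1, 1, 1]
```

This does work proportional to the number of words in train plus the size of dictionary, rather than their product as in the triple loop.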