I'm having a problem building a vocabulary of words in Python. My code goes through every word in a document of about 2.3 MB and checks whether the word is already in the vocabulary list. If it is not, it appends the word to the list.
The problem is that it is taking way too long (I haven't even gotten it to finish yet). How can I solve this?
Code:
words = [("_", "hello"), ("hello", "world"), ("world", "."), (".", "_")] # List of a ton of tuples of words
vocab = []
for w in words:
    if w not in vocab:
        vocab.append(w)
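
A likely reason for the slowdown is that a membership test on a list scans the list element by element, so the loop as a whole does work roughly proportional to the number of words times the vocabulary size. A minimal sketch of one common workaround, assuming the entries are hashable (tuples of strings are) and that first-seen order should be preserved. The name `seen` and the extra duplicate tuple in the sample data are introduced here only for illustration:

```python
# Hypothetical sample data standing in for the real list of word tuples;
# the last tuple is a deliberate duplicate.
words = [("_", "hello"), ("hello", "world"), ("world", "."), (".", "_"), ("hello", "world")]

seen = set()   # hash-based membership checks, constant time on average
vocab = []     # keeps the unique entries in first-seen order

for w in words:
    if w not in seen:
        seen.add(w)
        vocab.append(w)

print(vocab)
```

If order does not matter, `set(words)` alone may be enough. On Python 3.7+, `list(dict.fromkeys(words))` should give the same ordered result in one line.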