I am processing a large text file and as output I have a list of words:
['today', ',', 'is', 'cold', 'outside', '2013', '?', 'December', ...]
What I want to do next is convert everything to lowercase, remove all words that belong to a stopset (commonly used words), and remove punctuation. I can do it in three passes over the list:
import string

lower = [word.lower() for word in mywords]
removepunc = [word for word in lower if word not in string.punctuation]
final = [word for word in removepunc if word not in stopset]
I tried to use
[word for word in lower if word not in string.punctuation or word not in stopset]
to do what the last two lines of code are supposed to do, but it's not working. Where is my error, and is there a faster way to achieve this than iterating through the list three times?
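
For reference, here is a minimal sketch of what a single-pass version could look like. A word should be kept only when it is neither punctuation nor a stop word, so the two tests are joined with `and` (with `or`, a word is dropped only if it fails both tests). The sample `mywords` and `stopset` values are made up for illustration, and the `punct` set is an assumption that is not in the code above: `string.punctuation` is a plain string, so `in` does substring matching against it, while a set gives exact matches.

```python
import string

# Hypothetical sample data; the real list comes from the text file.
mywords = ['Today', ',', 'is', 'cold', 'outside', '2013', '?', 'December']
stopset = {'is', 'the', 'a'}  # assumed stop words, for illustration only

# Assumption: use a set of punctuation characters for exact-match lookups.
punct = set(string.punctuation)

# Lowercase lazily with a generator, then filter in the same pass.
# Keep a word only if it is NOT punctuation AND NOT a stop word.
final = [w for w in (word.lower() for word in mywords)
         if w not in punct and w not in stopset]

print(final)  # ['today', 'cold', 'outside', '2013', 'december']
```

One difference to be aware of: an empty string counts as "in" `string.punctuation` under substring matching, but it is not in the `punct` set, so it would survive this filter.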