I have a text preprocessing function like this:
import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

def preprocessing(text):
    text = text.lower()
    text = "".join([char for char in text if char not in string.punctuation])
    words = word_tokenize(text)
    words = [word for word in words if word not in stopwords.words('english')]
    words = [PorterStemmer().stem(word) for word in words]
    return words
I then apply this function to a DataFrame column like this:
df['reviewText'] = df['reviewText'].apply(lambda x: preprocessing(x))
But the DataFrame column has around 10,000 reviews, and the code takes too long to complete. Is there any way to add a progress bar so that I can get some idea of how much time remains?
PS. If you want to try this on your local machine, the data can be found on this site.
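For reference, here is a minimal sketch of the kind of progress reporting I have in mind, using only the standard library. The helper names `with_progress` and `process_all` are made up for illustration, and `str.lower` stands in for the real `preprocessing` function. The remaining-time figure is a rough estimate that assumes every review takes about the same time. I understand the third-party tqdm package also offers `tqdm.pandas()`, which adds a `progress_apply` method to a Series, but I have not tried that here.

```python
import sys
import time


def with_progress(items, total, label="progress", stream=sys.stderr):
    """Yield each item and write a one-line progress report to `stream`."""
    start = time.monotonic()
    for done, item in enumerate(items, start=1):
        yield item
        # Execution resumes here once the caller has finished with `item`.
        elapsed = time.monotonic() - start
        remaining = elapsed / done * (total - done)  # crude linear estimate
        stream.write(
            f"\r{label}: {done}/{total} ({done / total:.0%}), ~{remaining:.0f}s left"
        )
        stream.flush()
    stream.write("\n")


def process_all(texts, func):
    """Apply `func` to every text while reporting progress."""
    texts = list(texts)
    return [func(t) for t in with_progress(texts, total=len(texts))]


# Stand-in for the real preprocessing(); swap in the actual function.
results = process_all(["Good product!", "Not great."], str.lower)
```

With the DataFrame above, this would presumably be used as `df['reviewText'] = process_all(df['reviewText'], preprocessing)`, since the returned list has one entry per row.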