I'm testing the code below.
import re

features = df[['body','review_text']].values
labels = df['num_reviews'].values
processed_features = []
for sentence in range(0, len(features)):
    # Remove all the special characters
    processed_feature = re.sub(r'\W', ' ', str(features[sentence]))
    # Remove all single characters
    processed_feature = re.sub(r'\s+[a-zA-Z]\s+', ' ', processed_feature)
    # Remove single characters from the start (an unescaped ^ anchors the match)
    processed_feature = re.sub(r'^[a-zA-Z]\s+', ' ', processed_feature)
    # Substitute multiple spaces with a single space
    processed_feature = re.sub(r'\s+', ' ', processed_feature, flags=re.I)
    # Remove a prefixed 'b'
    processed_feature = re.sub(r'^b\s+', '', processed_feature)
    # Convert to lowercase
    processed_feature = processed_feature.lower()
    processed_features.append(processed_feature)
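To make the steps easier to try on one string at a time, the same cleaning can be written as a function. This is only a sketch: `clean_text` and the sample sentence are names and data I made up, and the start-of-string pattern uses an unescaped `^`, which is my assumption about what the tutorial intended.

```python
import re

def clean_text(text):
    """Apply the cleaning steps from the loop above to a single value."""
    text = re.sub(r'\W', ' ', str(text))          # special characters become spaces
    text = re.sub(r'\s+[a-zA-Z]\s+', ' ', text)   # drop isolated single letters
    text = re.sub(r'^[a-zA-Z]\s+', ' ', text)     # drop a single letter at the start
    text = re.sub(r'\s+', ' ', text)              # collapse runs of whitespace
    text = re.sub(r'^b\s+', '', text)             # drop a prefixed 'b'
    return text.lower()

# Hypothetical sample input, not taken from my DataFrame
sample = "Great phone!!  I love it :)"
print(clean_text(sample))
```

With a function like this, the loop could presumably be replaced by a list comprehension such as `[clean_text(row) for row in features]`.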
I encounter an error on this line:
processed_feature = re.sub(r'\W', ' ', str(features[sentence]))
The error message is: TypeError: 'str' object is not callable
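As far as I can tell, Python raises this message when the name `str` no longer refers to the built-in type, for example after an assignment like `str = 'something'` earlier in the same session. I have not confirmed that this is what happened in my session, so the sketch below only illustrates that assumption; `demo_shadowing` and its values are made up.

```python
def demo_shadowing():
    # Hypothetical earlier assignment: inside this function, `str` is now
    # a plain string object instead of the built-in type.
    str = "some text"
    try:
        str(123)                  # tries to call a string object
    except TypeError as exc:
        return exc.args[0]        # the text of the error message

message = demo_shadowing()
print(message)

# Outside the function the built-in is untouched and still works.
print(str(123))
```

If that is the cause, removing the stray assignment, running `del str` in the interactive session, or restarting the kernel should make the built-in `str` visible again.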
The code that I am testing is from this tutorial: https://stackabuse.com/python-for-nlp-sentiment-analysis-with-scikit-learn/
What is the easiest way to fix this? Or, is there an altogether better way to do this kind of text-cleaning exercise? Thanks.