
Hoping to get some help. I have a function that cleans a column of review data, i.e. free-form sentences of text. When I apply the function to my train and test data, I get the following warning. How can I fix it?

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

import string

def clean_text(text):
    # lowercase the text
    text = text.lower()
    # tokenize on spaces and strip punctuation from each token
    text = [word.strip(string.punctuation) for word in text.split(" ")]
    # remove words that contain numbers
    text = [word for word in text if not any(c.isdigit() for c in word)]
    # remove empty tokens and words with only one letter
    text = [t for t in text if len(t) > 1]
    # join the tokens back into one string, separated by spaces
    text = " ".join(text)
    return text

train, test = train_test_split(df, test_size=0.33, random_state=42)

train['reviews.text'] = train['reviews.text'].apply(clean_text)
test['reviews.text'] = test['reviews.text'].apply(clean_text)
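
For reference, one common way to avoid this warning is to take an explicit copy of each split before assigning to it. The sketch below is only an illustration under some assumptions: the small DataFrame is made-up stand-in data, the `iloc` slices stand in for the output of `train_test_split`, and `str.lower` stands in for `clean_text`.

```python
import pandas as pd

# Hypothetical stand-in data; the real df holds the review dataset.
df = pd.DataFrame(
    {"reviews.text": ["Great Tablet!", "Bought 2 of these", "Works OK", "Love it"]}
)

# Stand-in for train_test_split(df, ...): each half is a selection from df.
# Calling .copy() makes each half an independent DataFrame, so later
# assignments are unambiguous and do not touch df.
train = df.iloc[:3].copy()
test = df.iloc[3:].copy()

# str.lower is a placeholder for the clean_text function above.
train["reviews.text"] = train["reviews.text"].apply(str.lower)
test["reviews.text"] = test["reviews.text"].apply(str.lower)

print(train["reviews.text"].tolist())
print(df["reviews.text"].tolist())  # the original frame is unchanged
```

With the real split, the equivalent step would presumably be `train = train.copy()` and `test = test.copy()` right after the `train_test_split` call.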
Sam Russo
