I'm trying to replace a column in a DataFrame with preprocessed text data.
I have imported an Excel file as a pandas DataFrame.
df = pd.read_excel(*file path*)
This file consists of x rows of documents and 12 columns.
I extracted the column 'Text' for NLP.
text_article = (df['Text'])
I have preprocessed this column (removal of digits and stopwords, tokenization, lemmatization, etc.), resulting in the following variable: text_article['final']
I now want to replace the column (df['Text']) with text_article['final'], but don't know how.
When I export the DataFrame, I still get the original column 'Text':
df.to_excel('*name*.xlsx', index=False)
I've tried the following code to replace the column or add the column, but it doesn't seem to work.
df.insert(text_article['final'])
and
text_article['final'] = df['Text']
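For illustration, the result I'm after can be sketched as below. This is only a minimal sketch under assumptions: the small DataFrame stands in for the Excel import, `preprocess` is a hypothetical placeholder for the real cleaning steps, and `final` is assumed to be a Series with the same index as df.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from Excel
df = pd.DataFrame({
    "Text": ["The 3 cats sat.", "Dogs are running 2 miles."],
    "Source": ["a", "b"],
})

def preprocess(text: str) -> str:
    # Placeholder for the real preprocessing: lowercases and drops digits only
    return "".join(ch for ch in text.lower() if not ch.isdigit())

# Assumed shape of the preprocessed result: one value per row, same index as df
final = df["Text"].apply(preprocess)

# Assigning to an existing column name overwrites that column in place
df["Text"] = final
```

Note the direction of the assignment: the DataFrame column is on the left and the preprocessed values are on the right, which is the reverse of my second attempt. As far as I understand, df.insert also needs a position and a column name (`df.insert(loc, column, value)`), and it adds a new column rather than replacing an existing one.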
I'm relatively new to Python, so I hope I've clearly formulated my question. Thanks in advance.