Hello everyone, I have a dataset and I want to apply a function that lowercases the ingredients and removes the punctuation and the stopwords, so that I can make some plots afterwards.
The ingredients are stored as a list in each row of the dataset, and when I tried to apply a function I got an error.
Also, can anyone help me put all of these steps into a single function, so that the processed data keeps the same form (a list per row) in my dataset?
train_dataset
id cuisine ingredients
0 10259 greek [romaine lettuce, black olives, grape tomatoes...
1 25693 southern_us [plain flour, ground pepper, salt, tomatoes, g...
2 20130 filipino [eggs, pepper, salt, mayonaise, cooking oil, g...
3 22213 indian [water, vegetable oil, wheat, salt]
4 13162 indian [black pepper, shallots, cornflour, cayenne pe...
... ... ... ...
39769 29109 irish [light brown sugar, granulated sugar, butter, ...
39770 11462 italian [KRAFT Zesty Italian Dressing, purple onion, b...
39771 2238 irish [eggs, citrus fruit, raisins, sourdough starte...
39772 41882 chinese [boneless chicken skinless thigh, minced garli...
39773 2362 mexican [green chile, jalapeno chilies, onions, ground..
I wrote this function:
def preprocess(text):
    return str(text.lower())

train_dataset["lowercase"] = train_dataset["ingredients"].apply(preprocess)
and I get this error:
p
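For reference, here is a minimal sketch of the kind of function I am after. It assumes each cell of `ingredients` is a Python list of strings; in that case `.apply` passes the whole list to the function, and a list has no `.lower()` method, which would explain the failure. The `STOPWORDS` set is a small made-up placeholder and not a real library list, the `processed` column name is my own choice, and the second sample row is invented for illustration.

```python
import string

import pandas as pd

# Hypothetical, deliberately tiny stopword set for illustration only;
# a real list (e.g. from an NLP library) would be substituted here.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "or"}

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(ingredients):
    """Lowercase each ingredient, strip punctuation and stopwords.

    Takes a list of strings and returns a list of strings, so the
    column keeps the same list-per-row form.
    """
    cleaned = []
    for item in ingredients:
        item = item.lower().translate(PUNCT_TABLE)
        words = [w for w in item.split() if w not in STOPWORDS]
        if words:  # skip ingredients that end up empty
            cleaned.append(" ".join(words))
    return cleaned


# Small stand-in for train_dataset: the first row is copied from the
# post, the second row is made up to exercise the cleaning steps.
train_dataset = pd.DataFrame({
    "id": [22213, 1],
    "cuisine": ["indian", "example"],
    "ingredients": [
        ["water", "vegetable oil", "wheat", "salt"],
        ["KRAFT Zesty Dressing", "salt & pepper!"],
    ],
})

train_dataset["processed"] = train_dataset["ingredients"].apply(preprocess)
print(train_dataset[["cuisine", "processed"]])
```

Because the function returns a list, something like `train_dataset["processed"].explode().value_counts()` should then give per-ingredient counts for plotting.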