This was originally marked as a duplicate, but it concerns pandas specifically, so it differs from the question it was linked to.
I am trying to use re.sub to remove, within each pandas cell, only the first occurrence of each string in my list.
I have:
import pandas as pd
import re
df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5],
        "name": [
            "hello kitty hello",
            "hello puppy",
            "it is an helloexample",
            "for stackoverflow",
            "hello world",
        ],
    }
)
strings_to_remove = ["hello", "for", "an", "it"]
I want an output like:
df2 = pd.DataFrame(
    {
        'ID': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
        'name': {
            0: ' kty hello',
            1: ' puppy',
            2: ' is example',
            3: ' stackoverflow',
            4: ' world',
        },
    }
)
Notice how only the first occurrence of hello is removed from df2 under the 'name' column for each cell.
I would like to use something like re.sub, but I am not sure how to make it remove only the first occurrence of 'hello' within each cell. Any ideas?
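For reference, re.sub accepts a count argument that caps the number of replacements, so one possible approach is to loop over the list and call it with count=1 for each string. The following is only a sketch under assumptions: remove_first_each is a hypothetical helper name, the list includes "it" because the expected output above implies it (kitty becomes kty), and whitespace around removed words is left untouched, so the third row keeps a double space.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1, 2, 3, 4, 5],
        "name": [
            "hello kitty hello",
            "hello puppy",
            "it is an helloexample",
            "for stackoverflow",
            "hello world",
        ],
    }
)

# Assumption: "it" is included because the expected output implies it.
strings_to_remove = ["hello", "for", "an", "it"]


def remove_first_each(text, targets):
    """Remove the first occurrence of each target string from text."""
    for target in targets:
        # re.escape treats the target as a literal string;
        # count=1 stops re.sub after the first match.
        text = re.sub(re.escape(target), "", text, count=1)
    return text


# Apply the helper to every cell of the 'name' column.
result = df.assign(
    name=df["name"].map(lambda s: remove_first_each(s, strings_to_remove))
)
print(result["name"].tolist())
```

Note that the targets are applied in list order, so removing one string could in principle create or destroy a match for a later one. Joining the list into a single alternation pattern and passing count=1 would behave differently: it would remove only one match in total per cell, not one per string.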