I have a pandas dataframe that looks like this:
COL
hi A/P_90890 how A/P_True A/P_/93290 are AP_wueiwo A/P_|iwoeu you A/P_?9028k ?
...
Im fine, what A/P_49 A/P_0.0309 about you?
The expected result should be:
COL
hi how are you?
...
Im fine, what about you?
How can I efficiently remove every substring that starts with A/P_, both from a single column and from the whole pandas dataframe?
I tried with this regular expression:
A/P_(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+
However, I do not know whether there is a simpler or more robust way of removing all those substrings from my dataframe. How can I remove every token that has A/P_ at the beginning?
UPDATE
I tried:
df_sess['COL'] = df_sess['COL'].str.replace(r'A/P_(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', '', regex=True)
This works, but I would like to know whether there is a more robust way of doing this, possibly with a regular expression.
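To make the question reproducible, here is a minimal sketch of the kind of simpler pattern I have in mind. It rests on assumptions that may not hold for the real data: that each unwanted token runs up to the next whitespace, and that the slash can be missing (the sample contains AP_wueiwo). The two rows are copied from the sample above; the names pattern, cleaned and cleaned_df are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
df_sess = pd.DataFrame({
    "COL": [
        "hi A/P_90890 how A/P_True A/P_/93290 are AP_wueiwo A/P_|iwoeu you A/P_?9028k ?",
        "Im fine, what A/P_49 A/P_0.0309 about you?",
    ]
})

# Assumption: a token starts with "A/P_" (slash optional) and ends at the
# next whitespace. The leading \s* also swallows the space before the token.
pattern = r"\s*\bA/?P_\S*"

# Single column: regex=True is passed explicitly because the default
# differs between pandas versions.
cleaned = df_sess["COL"].str.replace(pattern, "", regex=True).str.strip()

# Whole dataframe: DataFrame.replace applies the same regex to string cells.
cleaned_df = df_sess.replace(to_replace=pattern, value="", regex=True)

print(cleaned.tolist())
```

On the sample, the second row comes out as "Im fine, what about you?", while the first row keeps a space before the final question mark ("hi how are you ?"), so this is close to, but not exactly, the expected result.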