I'm trying to replace certain alphanumeric values in a column called 'Tags'. For example, "PSPK01012L9_microsoft_abc" should become "microsoft_abc".
I have tried several ways of doing this with regular expressions, but each attempt changes every value in the column instead of only the matching ones:
import re
s = dataframe['Tags']
dataframe['Tags'] = re.sub('[A-Za-z0-9_]*_microsoft_abc', 'microsoft_abc', str(s))
dataframe['Tags'] = re.sub('[A-Za-z0-9_]*_google_abc', 'google_abc', str(s))
It would be great if someone could help me out. I'm a newbie in Python :(
Desired output in my CSV column 'Tags':
IAM~3rd
IAM~3rd, IAM~KI-000
IAM~1st
IAM~KI-000
IAM~3rd, IAM~KI-057
microsoft_abc
google_abc
Current output with the above regex:
dataframe['Tags'].value_counts()
0 0 microsoft_abc google_abc\...\n1 0 microsoft_abc google_abc\...\n2 0 microsoft_abc google_abc\...\n3 0 microsoft_abc google_abc\...\n4 0 microsoft_abc google_abc\...\n ... \n4762 0 microsoft_abc google_abc\...\n4763 0 microsoft_abc google_abc\...\n4764 0 microsoft_abc google_abc\...\n4765 0 microsoft_abc google_abc\...\n4766 0 microsoft_abc google_abc\...\nName: Tags, Length: 4767, dtype: object 4767
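A likely cause, judging from the output above: str(s) converts the whole Series into one long string (its printed form), so re.sub runs once on that text and the single result is assigned to every row. The second assignment also starts again from the original s, so it discards the first. Below is a minimal sketch of doing the substitution per value instead. The name strip_prefix, the sample values, and the assumption that only the microsoft_abc and google_abc suffixes matter are mine, not taken from the real data.

```python
import re

# A prefix of letters/digits/underscores, then "_", then one of the known
# suffixes. Only the two suffixes from the question are listed (assumption).
PATTERN = r'[A-Za-z0-9_]+_((?:microsoft|google)_abc)'

def strip_prefix(tag):
    """Drop any '<prefix>_' that sits in front of a known suffix."""
    # \1 keeps only the captured suffix, e.g. 'microsoft_abc'.
    return re.sub(PATTERN, r'\1', tag)

# Hypothetical sample values standing in for dataframe['Tags'].
samples = [
    'IAM~3rd',
    'IAM~3rd, IAM~KI-000',
    'PSPK01012L9_microsoft_abc',
    'XY12_google_abc',
    'microsoft_abc',
]

cleaned = [strip_prefix(tag) for tag in samples]
print(cleaned)
```

On the real column, the same pattern could probably be applied row by row with dataframe['Tags'] = dataframe['Tags'].str.replace(PATTERN, r'\1', regex=True), which works on each value separately and leaves missing values alone.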