I'm trying to create a new column filled with a value ('company') wherever the values in another column match one of the patterns in the regex below:
"INC|INC$|INC$|LTD$|CORP$|CORPORATION$|COMPANY$|LLC$|\*LLC$|\*,INC$|\*,CORP$|\*LTD$|\*CORP$|LEASING|TRANSPORTATION|CONSULTANTS|SERVICES|INCORPORATED"
Here is what I tried:
import re
patterns = [".INC.","INC$", ",INC$","LTD$", "CORP$", "CORPORATION$", "COMPANY$", "LLC$", ".*([a-zA-Z]+)LLC$", ".*([a-zA-Z]+),INC$", ".*([a-zA-Z]+),CORP$", ".*([a-zA-Z]+)LTD$", ".*([a-zA-Z]+)CORP$", "LEASING", "TRANSPORTATION", "CONSULTANTS", "SERVICES", "INCORPORATED"]
patterns = re.compile('|'.join(patterns))
data.loc[data['OwnerName'].str.contains(patterns), 'owner'] = 'company'
It matches and labels some strings but not others. For instance, xxx,INC is matched but xxx INC is not.
Could you please point out what I am doing wrong? Thanks!
The string xxx, INC should turn into company if matched, but it does not.
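To make the problem easier to reproduce, here is a minimal, self-contained sketch of the same check. The sample values are made up, since my real OwnerName column is not shown here, and the pattern list is a shortened subset. Passing the joined pattern as a plain string, stripping whitespace, and using case=False and na=False are assumptions for testing, not what my original code does.

```python
import pandas as pd

# Made-up sample values; the real OwnerName column is not shown here.
data = pd.DataFrame(
    {"OwnerName": ["xxx,INC", "xxx INC", "xxx, INC", "xxx inc", "JOHN SMITH", None]}
)

# A shortened subset of the pattern list above, for illustration only.
patterns = ["INC$", "LTD$", "CORP$", "CORPORATION$", "COMPANY$", "LLC$",
            "LEASING", "TRANSPORTATION", "CONSULTANTS", "SERVICES", "INCORPORATED"]
regex = "|".join(patterns)

# Strip surrounding whitespace first so that "$" anchors at the last visible character.
# case=False ignores letter case; na=False treats missing names as non-matches.
mask = data["OwnerName"].str.strip().str.contains(regex, case=False, na=False)

# Label only the matching rows; the other rows stay empty (NaN) in the new column.
data.loc[mask, "owner"] = "company"

print(data)
```

Comparing this output with the result on the real data, row by row, should show whether the mismatch comes from letter case, stray whitespace, or missing values rather than from the patterns themselves.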