I am trying to standardize the address data in my dataframe.
I was only able to correctly replace words when each dictionary key has a single related value. I am not able to replace words when one key has multiple values.
Input data
data['Postal_Addr']
UNIT 3510 35/F THE CENTRE
HARNEYS THE CENTRE 99 QUEENS STR ROAD CETRAL
OCBC WING HANG BANK UNIT B 5/FLR WING HANG INSURANCE CENTRE BUILDING
M3 CAPITAL PARTNERS 50 LEVEL 3 CENTRAL
GALERIE H LDT 50 CENTRAL 17TH FLOOR
36/FL INFINITUS PLAZA STREET
51 SAI ST STRAT RAOD
Dictionary
Mapping_matrix.set_index('Orginal')['Related_Words'].to_dict()
{'FLOOR ': 'F,FL,FLR',
'ROAD': 'RD, RD.,RAOD',
'STREET': 'ST, STR,STRAT',
'CENTRAL': 'CENTRE, CETRAL'}
Expected Output
UNIT 3510 35/FLOOR THE CENTRAL 99 QUEENS ROAD CENTRAL
HARNEYS THE CENTRAL 99 ROAD CENTRAL
OCBC WING HANG BANK UNIT B 5/FLOOR WING HANG INSURANCE CENTRAL BUILDING
M3 CAPITAL PARTNERS 50 LEVEL 3 CENTRAL
GALERIE LDT 50 CENTRAL 17TH FLOOR
51 SAI STREET STREET ROAD
I tried the code from this reference (How to replace a string using a dictionary containing multiple values for a key in python):
def replace_words(d, col):
    d1 = {r'(?<!\S)' + k.strip() + r'(?!\S)': k1 for k1, v1 in d.items() for k in v1.split(',')}
    df[col] = df[col].replace(d1, regex=True)
    return df[col]

df['col'] = replace_words(d, 'col')
But it didn't work.
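A possible cause, offered as a guess rather than a confirmed diagnosis: the `(?<!\S)` lookbehind only allows a match at the start of the string or after whitespace, so the `F` in `35/F` (preceded by `/`) can never match, and an unescaped variant such as `RD.` treats `.` as a wildcard. The minimal sketch below assumes that a word may be bounded by any non-alphanumeric character, escapes each variant, and tries longer variants first. The names `mapping`, `build_replacer`, and `standardize` are hypothetical, and the dictionary is copied from above.

```python
import re

# Mapping copied from the question: canonical word -> comma-separated variants.
mapping = {
    'FLOOR ': 'F,FL,FLR',
    'ROAD': 'RD, RD.,RAOD',
    'STREET': 'ST, STR,STRAT',
    'CENTRAL': 'CENTRE, CETRAL',
}


def build_replacer(mapping):
    """Return a compiled pattern and a variant -> canonical lookup table."""
    lookup = {
        variant.strip(): canonical.strip()
        for canonical, variants in mapping.items()
        for variant in variants.split(',')
        if variant.strip()
    }
    # Longest variants first, so 'STRAT' is tried before 'STR' and 'ST'.
    alternatives = sorted(lookup, key=len, reverse=True)
    # Assumption: a token is delimited by anything that is not a letter or digit,
    # which lets 'F' match in '35/F' while leaving 'FLOOR' and 'STREET' untouched.
    pattern = re.compile(
        r'(?<![A-Za-z0-9])(?:'
        + '|'.join(re.escape(v) for v in alternatives)
        + r')(?![A-Za-z0-9])'
    )
    return pattern, lookup


PATTERN, LOOKUP = build_replacer(mapping)


def standardize(text):
    """Replace every known variant in one address string with its canonical word."""
    return PATTERN.sub(lambda m: LOOKUP[m.group(0)], text)


print(standardize('51 SAI ST STRAT RAOD'))
print(standardize('UNIT 3510 35/F THE CENTRE'))
```

With the dataframe from the question, this could presumably be applied as `data['Postal_Addr'] = data['Postal_Addr'].map(standardize)`. Doing all replacements in a single pass also keeps an already replaced word from being matched again by a later rule.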