I'm trying to search a DataFrame for certain words listed in the values of a dictionary; if any of them appear, they should be replaced with the corresponding key.
units_dic = {'grams': ['g', 'Grams'],
             'kg': ['kilogram', 'kilograms']}
The problem is that some unit abbreviations are single letters, so the loop replaces every occurrence of those letters as well. I want the replacement to happen only when the word is preceded by a number, to make sure it's a unit.
Dataframe
Id | test
---------
1 |'A small paperclip has a mass of about 111 g'
2 |'1 kilogram =1000 g'
3 |'g is the 7th letter in the ISO basic Latin alphabet'
Replacement Loop
x = df.copy()
for k in units_dic:
    for i in range(len(x['test'])):
        for w in units_dic[k]:
            x['test'][i] = str(x['test'][i]).replace(str(w), str(k))
The Output
Id | test
---------
1 |'A small paperclip has a mass of about 111 grams'
2 |'1 kg =1000 grams'
3 |'grams is the 7th letter in the ISO basic Latin alphabet'
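One possible way to restrict the replacement to units that follow a number is a regular expression that requires a digit before the alias. The following is only a minimal sketch, not a definitive solution: `alias_to_unit`, `pattern`, and `normalize_units` are hypothetical names introduced here, and it assumes a unit counts only when a digit (optionally followed by whitespace) comes immediately before it and a word boundary comes right after it.

```python
import re

units_dic = {'grams': ['g', 'Grams'],
             'kg': ['kilogram', 'kilograms']}

# Invert the dictionary: each alias maps to its canonical unit (hypothetical helper).
alias_to_unit = {alias: unit
                 for unit, aliases in units_dic.items()
                 for alias in aliases}

# Longer aliases first, so 'kilograms' is tried before 'kilogram'.
alternatives = '|'.join(re.escape(alias)
                        for alias in sorted(alias_to_unit, key=len, reverse=True))

# Assumption: a unit is an alias directly preceded by a digit and optional
# whitespace, and followed by a word boundary.
pattern = re.compile(r'(\d\s*)(' + alternatives + r')\b')


def normalize_units(text):
    """Replace unit aliases with their dictionary key, only after a number."""
    return pattern.sub(lambda m: m.group(1) + alias_to_unit[m.group(2)], str(text))


rows = ['A small paperclip has a mass of about 111 g',
        '1 kilogram =1000 g',
        'g is the 7th letter in the ISO basic Latin alphabet']
for row in rows:
    print(normalize_units(row))
# A small paperclip has a mass of about 111 grams
# 1 kg =1000 grams
# g is the 7th letter in the ISO basic Latin alphabet
```

On the DataFrame from the question, this could presumably be applied to the whole column with `x['test'] = x['test'].astype(str).apply(normalize_units)` instead of the nested loops.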