
I have a dataframe which contains the below column:

column_name
CUVITRU 8 gram
CUVITRU 1 grams

I want to replace both gram and grams with gm, so I have created a dictionary:

dict_ = {'gram':'gm','grams':'gm'}

I am able to replace them, but grams is being converted to gms. Below is the column after conversion:

column_name
CUVITRU 8 gm
CUVITRU 1 gms

How can I solve this issue?

Below is my code:

dict_ = {'gram':'gm','grams':'gm'}
for key, value in dict_.items():
    my_string = my_string.replace(key,value)

def unique_list(l):
    # Keep the first occurrence of each word, preserving order
    ulist = []
    for x in l:
        if x not in ulist:
            ulist.append(x)
    return ulist

my_string = ' '.join(unique_list(my_string.split()))
Barmar
  • Are these in a pandas dataframe? Could you provide some code to generate an object similar to the ones you're working with? – Patrick Haugh Mar 08 '19 at 21:27
  • It's because `'gram':'gm'` is listed first in your dictionary. So it replaces the "gram" within "grams" and the result is "gms". If you put `'gram':'gm'` after `'grams':'gm'` it should work fine – Reedinationer Mar 08 '19 at 21:29
  • Yes, it is a pandas dataframe – Sharmili Nag Mar 08 '19 at 21:30
  • @Reedinationer I have tried that.. it's still the same – Sharmili Nag Mar 08 '19 at 21:32
  • You don't really care about the dict, just the key/value pairs returned by its `items()` method. Just store a list of tuples in the first place (in the desired order): `d = [("grams", "gm"), ("gram", "gm")]`. – chepner Mar 08 '19 at 21:33
  • @Reedinationer dict keys are hashed, changing order in definition does not change iteration results – danchik Mar 08 '19 at 21:33
  • @danchik It does as of Python 3.7, where a dict is guaranteed to retain the insertion order of its keys. – chepner Mar 08 '19 at 21:34

4 Answers

1

This happens because it finds 'gram' inside 'grams'. One way around it is to use a regular expression instead of a plain string replacement, matching on word boundaries, like (r"\b%s\.... Look at the answer using .sub here for an example: search-and-replace-with-whole-word-only-option
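
As a rough sketch of the word-boundary idea (not the code from the linked answer), the replacement could be done in one pass with re.sub. The names `replacements`, `pattern`, and `abbreviate` are made up for illustration, and the sample strings are the ones from the question:

```python
import re

# Mapping from the question; the order does not matter with this approach
replacements = {'gram': 'gm', 'grams': 'gm'}

# Build one pattern such as \b(gram|grams)\b so only whole words match
pattern = re.compile(
    r"\b(" + "|".join(re.escape(key) for key in replacements) + r")\b"
)

def abbreviate(text):
    # Look up each matched whole word in the mapping
    return pattern.sub(lambda match: replacements[match.group(1)], text)

print(abbreviate('CUVITRU 8 gram'))
print(abbreviate('CUVITRU 1 grams'))
```

Because the pattern requires a word boundary on both sides, 'gram' can no longer match inside 'grams', and words that merely contain a key (for example 'kilograms') are left alone.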

danchik
0

Put the longer string grams before the shorter one gram, like this: {'grams':'gm','gram':'gm'}, and it will work. Note that I'm using a recent Python 3 (3.7.2), which guarantees that items are retrieved in the same order in which they were inserted into the dictionary. In earlier Pythons the order may happen to be preserved, but it isn't guaranteed (and this appears to be the problem).
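
A small sketch of the difference, assuming Python 3.7 or later so that dict iteration follows insertion order. The helper name `apply_replacements` is hypothetical and just wraps the loop from the question:

```python
def apply_replacements(text, mapping):
    # Apply each replacement in the dict's iteration (insertion) order
    for key, value in mapping.items():
        text = text.replace(key, value)
    return text

shorter_first = {'gram': 'gm', 'grams': 'gm'}
longer_first = {'grams': 'gm', 'gram': 'gm'}

# 'gram' is replaced inside 'grams' first, leaving the trailing 's'
print(apply_replacements('CUVITRU 1 grams', shorter_first))

# 'grams' is replaced as a whole before 'gram' is tried
print(apply_replacements('CUVITRU 1 grams', longer_first))
```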

0

You don't actually care about the dict; you care about the key/value pairs produced by its items() method, so just store that in the first place. This lets you specify the order of replacements to try regardless of your Python version.

d = [('grams', 'gm'), ('gram', 'gm')]
for key, value in d:
    my_string = my_string.replace(key,value)
chepner
0

You can make replacements in the reverse order of the key lengths instead:

dict_ = {'gram':'gm','grams':'gm'}
for key in sorted(dict_, key=len, reverse=True):
    my_string = my_string.replace(key, dict_[key])
blhsing