5

I'm trying to replace the occurrence of a word with another:

word_list = { "ugh" : "disappointed"}

tmp = ['laughing ugh']

for index, data in enumerate(tmp):
    for key, value in word_list.iteritems():
        if key in data:
            tmp[index]=data.replace(key, word_list[key])

print tmp

Whereas this works... the occurrence of ugh in laughing is also being replaced in the output: ladisappointeding disappointed.

How does one avoid this so that the output is laughing disappointed?

Ian
  • 30,182
  • 19
  • 69
  • 107
user47467
  • 1,045
  • 2
  • 17
  • 34

6 Answers6

5

In that case, you may want to consider to replace word by word.

Example:

word_list = { "ugh" : "disappointed"}
tmp = ['laughing ugh']

for t in tmp:
    words = t.split()
    for i in range(len(words)):
        if words[i] in word_list.keys():
            words[i] = word_list[words[i]]
    newline = " ".join(words)
    print(newline)

Output:

laughing disappointed

Step-by-Step Explanations:

  1. Get every sentence in the tmp list:

    for t in tmp:
    
  2. split the sentence into words:

    words = t.split()
    
  3. check whether any word in words are in the word_list keys. If it does, replace it with its value:

    for i in range(len(words)):
        if words[i] in word_list.keys():
            words[i] = word_list[words[i]]
    
  4. rejoin the replaced words and print the result out:

    newline = " ".join(words)
    print(newline)
    
Ian
  • 30,182
  • 19
  • 69
  • 107
4

You can do this by using a RegEx:

>>> import re
>>> re.sub(r'\bugh\b', 'disappointed', 'laughing ugh')
'laughing disappointed'

The \b stands for a word boundary.

JonnyTieM
  • 177
  • 2
  • 12
2

Use re.sub:

for key, value in word_list.items():
    tmp = re.sub("\\b{}\\b".format(key), value, tmp[index])
2
word_list = { "ugh" : "disappointed", "123" : "lol"}
tmp = ['laughing 123 ugh']

for word in tmp:
    words = word.split()
for i in words[:]:
    if  i in word_list.keys():
    replace_value = word_list.get(i)
    words[words.index(i)] = replace_value
output = " ".join(words)
print output

This code will swap each key of the dict (so the word you want to replace ) with the dict value of that key ( the word you want it to be replaced with) in every case and with multiple values!

Output:
    laughing lol disappointed

Hope that helps!

1

You can use regular expressions:

import re

for index, data in enumerate(tmp):
    for key, value in word_list.iteritems():
        if key in data:
            pattern = '\b' + key + '\b'
            data = re.sub(pattern, value, data)
            tmp[index] = data

Side note: you need data = ... line (to overwrite data variable) otherwise it will work incorrectly when word_list contains multiple entries.

freakish
  • 54,167
  • 9
  • 132
  • 169
1

Fast:

>>> [re.sub(r'\w+', lambda m: word_list.get(m.group(), m.group()), t) 
     for t in tmp]
['laughing disappointed']
>>> 

Very Fast:

>>> [re.sub(r'\b(?:%s)\b' % '|'.join(word_list.keys()), lambda m: word_list.get(m.group(), m.group()), t) 
...  for t in tmp]
['laughing disappointed']
>>> 
kxr
  • 4,841
  • 1
  • 49
  • 32