This seems like it should be straightforward, but it is not. I want to implement string replacement in Python where the strings to be replaced can be unigrams or n-grams, but I do not want to replace a string that is contained within a longer word.

So for example:

x='hello world'
x.replace('llo', 'll')

returns:

'hell world'

but I don't want that to happen.

Splitting the string on whitespace works for individual words (unigrams), but I also want to replace n-grams.

so:

'this world is a happy place to be'

to be converted to:

'this world is a miserable cesspit to be'

and splitting on whitespace does not work.

Is there a built-in function in Python 3 that allows me to do this?

I could do:

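if len(old_string.split(' ')) > 1:
    # n-gram: plain substring replacement (can still match inside a longer word)
    x = x.replace(old_string, new_string)
else:
    # unigram: replace only the tokens that match exactly
    x_array = x.split(' ')
    x_array = [new_string if y == old_string else y for y in x_array]
    x = ' '.join(x_array)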
laila
  • Write a regular expression with word `\b`oundaries and use `re.sub` instead of `str.replace` (see `re.sub(r'\bllo\b', 'll', 'hello world')`)? – jonrsharpe Jul 29 '15 at 09:53
  • neater than my solution, thanks – laila Jul 29 '15 at 09:57
  • possible duplicate of [Replace exact substring in python](http://stackoverflow.com/questions/31697043/replace-exact-substring-in-python) – jonrsharpe Jul 29 '15 at 09:58

1 Answer

You could do this:

import re

# raw strings, so that \g is not treated as a string escape sequence
re_search = r'(?P<pre>[^ ])llo(?P<post>[^ ])'
re_replace = r'\g<pre>ll\g<post>'

print(re.sub(re_search, re_replace, 'hello world'))
print(re.sub(re_search, re_replace, 'helloworld'))

Output:

hello world
hellworld

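Note that the characters captured by pre and post are part of the match, so the replacement has to put them back with \g<pre> and \g<post>. Also note that this pattern only matches llo when it has a non-space character on both sides.

Now I see the comments: \b would probably work better.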
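As a rough sketch of the `\b` suggestion from the comments, the helper below wraps the search string in word boundaries so it handles both unigrams and n-grams. The name `replace_whole` is made up for illustration, and it assumes the search string starts and ends with a word character (letter, digit or underscore); otherwise `\b` will not behave as intended.

```python
import re

def replace_whole(text, old, new):
    """Replace `old` with `new` only where `old` is not part of a longer word.

    Hypothetical helper; assumes `old` starts and ends with a word character.
    """
    # re.escape keeps any regex metacharacters in `old` literal;
    # \b asserts a word boundary on each side of it.
    pattern = r'\b' + re.escape(old) + r'\b'
    # Using a function as the replacement keeps backslashes in `new` literal.
    return re.sub(pattern, lambda match: new, text)

# 'llo' only occurs inside 'hello', so nothing is replaced here.
print(replace_whole('hello world', 'llo', 'll'))

# An n-gram is replaced as a unit.
print(replace_whole('this world is a happy place to be',
                    'happy place', 'miserable cesspit'))
```

No named groups are needed here, because `\b` matches a position rather than a character, so nothing has to be put back in the replacement.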

hiro protagonist