Replace only matching words

Question

I have a string,

'ANNA BOUGHT AN APPLE AND A BANANA'

and want to replace only the standalone word 'AN' to get

'ANNA BOUGHT X APPLE AND A BANANA'

but this simple code:

text.replace('AN', 'X')

returns:

XNA BOUGHT X APPLE XD A BXXA

How can I make it replace only the whole word?

Does this answer your question? [Do regular expressions from the re module support word boundaries (\b)?](https://stackoverflow.com/questions/3995034/do-regular-expressions-from-the-re-module-support-word-boundaries-b) — sushanth, Sep 04 '20 at 11:16

maciejwww · Answer 1 · 2020-09-04T11:37:58.133

1

This code handles 'AN' at the beginning, middle, or end of the string, with or without adjacent punctuation marks:

import re

your_string = 'AN ANNA BOUGHT AN APPLE AND A BANANA AN'
replaced_string = re.sub(r'\bAN\b', 'X', your_string)
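
To try the word-boundary pattern on other inputs, here is a minimal sketch that wraps it in a helper. The name `replace_word` is hypothetical and not part of the answer; `re.escape` is added on the assumption that the word to replace might contain regex metacharacters.

```python
import re

def replace_word(text, word, replacement):
    """Replace `word` only where it stands alone as a whole word."""
    # re.escape guards against regex metacharacters in `word`;
    # \b matches at the boundary between a word character and a non-word character.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.sub(pattern, replacement, text)

print(replace_word('AN ANNA BOUGHT AN APPLE AND A BANANA AN', 'AN', 'X'))
# X ANNA BOUGHT X APPLE AND A BANANA X
print(replace_word('AN, AN. (AN)', 'AN', 'X'))
# X, X. (X)
```

Note that `\b` only behaves as expected here when `word` itself starts and ends with a word character (letter, digit, or underscore).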

edited Sep 04 '20 at 11:37

answered Sep 04 '20 at 11:31

maciejwww


score 0 · Answer 2 · answered Sep 04 '20 at 11:16


If you want to search for the word AN, you can use text.replace(' AN ', ' X ') with the surrounding spaces. That way you replace only the standalone word and leave occurrences inside other words untouched.


JPery


That will fail if the string is for example `'AN APPLE'` with no space at the start. – alani Sep 04 '20 at 11:22

In this case it works, but if 'AN' were at the beginning or the end of the string, it wouldn't be replaced. – maciejwww Sep 04 '20 at 11:25
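
The limitation raised in these comments can be seen in a short sketch. The sample strings and variable names below are illustrative, not taken from the thread.

```python
# Padding the search string with spaces only matches 'AN' when it has
# a literal space on both sides.
middle = 'ANNA BOUGHT AN APPLE'.replace(' AN ', ' X ')
start = 'AN APPLE'.replace(' AN ', ' X ')
end = 'SHE BOUGHT AN'.replace(' AN ', ' X ')
punct = 'NOT A, BUT AN, APPLE'.replace(' AN ', ' X ')

print(middle)  # ANNA BOUGHT X APPLE
print(start)   # AN APPLE (unchanged: no leading space)
print(end)     # SHE BOUGHT AN (unchanged: no trailing space)
print(punct)   # NOT A, BUT AN, APPLE (unchanged: comma instead of space)
```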

score 0 · Answer 3 · 2020-09-04T11:36:21.230


Let string = 'ANNA BOUGHT AN APPLE AND A BANANA'

Then myList = string.split(' ')

It will return myList = ['ANNA', 'BOUGHT', 'AN', 'APPLE', 'AND', 'A', 'BANANA']

Then you can do the following

myList[myList.index('AN')] = 'X'

If 'AN' occurs more than once, loop over the whole list instead:

for i in range(len(myList)):
    if myList[i] == 'AN':
        myList[i] = 'X'
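
The steps above stop at the modified list. Here is a minimal sketch of the full round trip back to a string, assuming the words are separated by single spaces; the names `words` and `result` are illustrative.

```python
string = 'ANNA BOUGHT AN APPLE AND A BANANA'

# Split on single spaces so that joining with ' ' restores the original spacing.
words = string.split(' ')

# Replace every exact match. Unlike list.index, this does not raise
# ValueError when 'AN' is absent, and it handles repeated occurrences.
words = ['X' if w == 'AN' else w for w in words]

result = ' '.join(words)
print(result)  # ANNA BOUGHT X APPLE AND A BANANA
```

A word followed by punctuation, such as 'AN,', would not compare equal to 'AN' and so would be left as-is by this approach.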

edited Sep 04 '20 at 11:36

answered Sep 04 '20 at 11:18

What if there is not exactly one occurrence of `'AN'` (zero or more than one)? – alani Sep 04 '20 at 11:22
In that case, we can traverse over the whole list, replacing each `AN` with `X` – Sep 04 '20 at 11:32

score 0 · Answer 4 · answered Sep 04 '20 at 11:26

You can use regular expressions - note the use of \b for word boundaries:

import re
line = 'ANNA BOUGHT AN APPLE AND A BANANA'
print(re.sub(r'\bAN\b', 'X', line))

Or, without regular expressions (this does not preserve the exact amount of whitespace between words, and may not be exactly equivalent if there is also punctuation):

line = 'ANNA BOUGHT AN APPLE AND A BANANA'

print(' '.join('X' if word == 'AN' else word
               for word in line.split()))
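
To illustrate the caveat about whitespace and punctuation, here is a small comparison of the two approaches on a made-up input (the input string and variable names are assumptions for the example):

```python
import re

# Two spaces after ANNA, and a comma directly after AN.
line = 'ANNA  BOUGHT AN, APPLE'

with_regex = re.sub(r'\bAN\b', 'X', line)
with_split = ' '.join('X' if word == 'AN' else word
                      for word in line.split())

print(repr(with_regex))  # 'ANNA  BOUGHT X, APPLE'
print(repr(with_split))  # 'ANNA BOUGHT AN, APPLE'
```

The regex version keeps the double space and replaces 'AN' before the comma; the split version collapses the whitespace and treats 'AN,' as a different word.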

score 0 · Answer 5 · answered Sep 04 '20 at 11:39

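Regex is the best tool for this kind of manipulation, and for more complex ones. It is a bit intimidating to learn, but it gets easy once you are used to it. Note that the pattern needs \b word boundaries; a bare 'AN' pattern would also match inside longer words, just like text.replace.

import re
text = 'ANNA BOUGHT AN APPLE AND A BANANA'
pattern = r'\bAN\b'  # \b keeps 'AN' inside ANNA, AND, BANANA from matching
new = re.sub(pattern, 'X', text)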
print(new)

Lutz · Answer 6 · 2020-09-16T17:03:03.853

0

Regex is the way to go, here with lookahead and lookbehind:

import re
line = 'AN ANNA BOUGHT AN APPLE AND A BANANA AN. AN'
pattern = r'((?<=^)|(?<=\W))AN(?=\W|$)'  # raw string so \W is not treated as a string escape
p = re.compile(pattern)
print(p.sub('X', line))

input: AN ANNA BOUGHT AN APPLE AND A BANANA AN. AN
result: X ANNA BOUGHT X APPLE AND A BANANA X. X
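
As a possible simplification (not from the answer itself), the same idea can be written with negative lookarounds, which cover the start and end of the string without separate `^` and `$` alternatives. The sketch below checks it against the `\b` form on the same input; the variable names are illustrative.

```python
import re

line = 'AN ANNA BOUGHT AN APPLE AND A BANANA AN. AN'

# "Not preceded by a word character" and "not followed by a word character".
lookaround = re.compile(r'(?<!\w)AN(?!\w)')
boundary = re.compile(r'\bAN\b')

result_lookaround = lookaround.sub('X', line)
result_boundary = boundary.sub('X', line)

print(result_lookaround)                     # X ANNA BOUGHT X APPLE AND A BANANA X. X
print(result_lookaround == result_boundary)  # True
```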

edited Sep 16 '20 at 17:03

answered Sep 16 '20 at 15:44

Lutz

