2

Given a string:

s = "abc, Abc, aBc, abc-def, abca"

and a word:

w = "abc"

I want to modify s in the following way:

!+abc+!, !+Abc+!, !+aBc+!, abc-def, abca

In other words, I want to replace every occurrence of abc, regardless of the case of its letters, with itself preceded by !+ and followed by +!.

I already know that my question is pretty similar to this question: Match string in python regardless of upper and lower case differences

However, my case is slightly different.

At the moment my solution is pretty dirty and does not work properly:

s = "abc, Abc, aBc, abc-def, abca"
w = "abc"
if w in s:
    s = s.replace(w, "!+"+w+"+!")
if w.title() in s:
    s = s.replace(w.title(), "!+"+w.title()+"+!")
sono

3 Answers

4

It's pretty straightforward even without a regex.

>>> ', '.join('!+{}+!'.format(x) if x.lower()==w else x for x in s.split(', '))
'!+abc+!, !+Abc+!, !+aBc+!, abc-def, abca'

edit: passing a list comprehension to join is faster than passing a generator expression, so

', '.join(['!+{}+!'.format(x) if x.lower()==w else x for x in s.split(', ')])

should be preferred. Read this question and the answer by Raymond Hettinger in particular for the reason.

timgeb
  • What if s is a long sentence, where words are not simply recognizable by splitting the sentence by `s.split(", ")`? – sono Apr 22 '17 at 19:50
  • @Ale then we have a different question. Depending on the complexity of the task, either a regex, multiple regexes or a full parser can be considered. – timgeb Apr 23 '17 at 09:14
2

Use a regular expression:

In [16]: re.sub(r'(?: |^)(abc),',r'!+\1+!,', s, flags=re.I)
Out[16]: '!+abc+!,!+Abc+!,!+aBc+!, abc-def, abca'

The pattern (?: |^)(abc), matches every abc that is preceded by a space or the start of the string (^) and followed by a comma, and replaces it with the first captured group surrounded by the characters you want. Note that ?: at the start of the first group makes it a non-capturing group, so \1 refers to abc. The re.I flag makes the match case-insensitive.

If you also want to keep the spaces, just make the first group a capturing group:

In [19]: re.sub(r'( |^)(abc),',r'\1!+\2+!,', s, flags=re.I)
Out[19]: '!+abc+!, !+Abc+!, !+aBc+!, abc-def, abca'
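To make the difference between the two kinds of group concrete, here is a minimal sketch that inspects what each pattern captures. The sample string " Abc," and the variable names are only illustrative assumptions, not part of the original session:

```python
import re

sample = " Abc,"  # illustrative input: a space, the word, then a comma

# (?: |^) is non-capturing, so only the word itself is captured as group 1.
non_capturing_groups = re.search(r'(?: |^)(abc),', sample, flags=re.I).groups()

# ( |^) is capturing, so the leading space becomes group 1 and the word group 2.
capturing_groups = re.search(r'( |^)(abc),', sample, flags=re.I).groups()

print(non_capturing_groups)  # ('Abc',)
print(capturing_groups)      # (' ', 'Abc')
```

This is why the second replacement string has to use \1 for the space and \2 for the word.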

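Also note that if you want to apply several replacement strings with the same regex, you can compile the regex once using re.compile() and apply each replacement within a loop. Pass the flags to re.compile(), because the sub() method of a compiled pattern does not accept a flags argument:

my_regex = re.compile(r'( |^)(abc),', flags=re.I)
# list_of_patterns holds the replacement strings, e.g. r'\1!+\2+!,'
my_new_result = [my_regex.sub(pattern, s) for pattern in list_of_patterns]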

As a more flexible way to use re.sub, you can also pass a function as the replacement and so perform more operations on the captured strings. For example, if you want to lowercase the matched strings:

s = "abc, Abc, aBc (abc) abc abc <abc> abc-def, abca"

In [31]: re.sub(r'(^|\W)(abc)($|\W)', lambda x: '{}!+{}+!{}'.format(*x.groups()).lower() if x.group(3) != '-' else x.group(0), s, flags=re.I)
Out[31]: '!+abc+!, !+abc+!, !+abc+! (!+abc+!) !+abc+! abc <!+abc+!> abc-def, abca'
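Note that in the output above the second of the two adjacent abc words is left unmarked, because the space that should precede it was already consumed by the previous match. One possible way around this, sketched here as an assumption rather than as part of the approach above, is to use lookarounds, which check the neighbouring characters without consuming them. The helper name mark_word is hypothetical; the word is passed as a variable and escaped with re.escape, and the original case is kept:

```python
import re

def mark_word(text, word):
    """Hypothetical helper: wrap standalone occurrences of `word` in !+ ... +!."""
    # (?<![\w-]) : not preceded by a word character or a hyphen
    # (?![\w-])  : not followed by a word character or a hyphen
    pattern = r'(?<![\w-])' + re.escape(word) + r'(?![\w-])'
    # m.group(0) is the text exactly as it appears in the input, so case is kept.
    return re.sub(pattern, lambda m: '!+' + m.group(0) + '+!', text,
                  flags=re.IGNORECASE)

s = "abc, Abc, aBc (abc) abc abc <abc> abc-def, abca"
print(mark_word(s, "abc"))
# !+abc+!, !+Abc+!, !+aBc+! (!+abc+!) !+abc+! !+abc+! <!+abc+!> abc-def, abca
```

Treating the hyphen as part of a word is a design choice made here so that abc-def stays untouched; adjust the character class if other joiners should count as well.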
Mazdak
  • Is it possible to put a variable containing the pattern inside the `re.sub` function instead of the pattern itself? I have to call this procedure several times with patterns that I do not know in advance. – sono Apr 22 '17 at 20:17
  • @Ale Of course. You can just pass the variable instead of the pattern, but remember that a pattern written as a literal should be a raw string (prefixed with `r`). You may want to do this in a list comprehension or any kind of loop. – Mazdak Apr 23 '17 at 04:53
  • Thanks a lot, but it seems too specific to me. I mean, what happens if I have `s = "abc, Abc, aBc (abc) abc abc <abc> abc-def, abca"`? I want the output to be: `!+abc+!, !+abc+!, !+abc+! (!+abc+!) !+abc+! !+abc+! <!+abc+!> abc-def, abca`. I tried with `s = re.sub(r"\b"+w+r"\b", "!+"+w+"+!", s, flags=re.I)`, where `w="abc"`. It outputs `!+abc+!, !+abc+!, !+abc+! (!+abc+!) !+abc+! !+abc+! <!+abc+!> !+abc+!-def, abca`, which has two errors: 1) it outputs every occurrence in lowercase; 2) it matches "abc" in "abc-def". – sono Apr 23 '17 at 08:10
  • Where did you define `re`? I got this error: Traceback (most recent call last): File "", line 1, in NameError: name 're' is not defined – gonzalez.ivan90 May 03 '18 at 04:29
  • @gonzalez.ivan90 You need to import it. `import re`. – Mazdak May 03 '18 at 08:44
1

You can use re.sub for the replacement, as shown below. Wrap the search string in re.escape() if it may contain characters that are special in a regex; if it contains none, it works without escaping as well.

re.sub(re.escape(value_want_to_replace), 'value_to_be_replaced_with', original_string, flags=re.IGNORECASE)

Example: Replacing all Null/NULL/nulL with arpan

import re

expected_result = "We are doing ignore case replacement for Null, NULL, nulL and space, SPACE, sPace"
# Replace every case variant of "null" with "arpan"
__expected_result = re.sub(re.escape("null"), 'arpan', expected_result, flags=re.IGNORECASE)
print(__expected_result)

Result:

We are doing ignore case replacement for arpan, arpan, arpan and space, SPACE, sPace
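To illustrate when re.escape() actually matters, here is a small sketch with a search term that contains a regex metacharacter. The sample text and the term "a.c" are assumptions chosen for the demonstration:

```python
import re

text = "a.c, abc, A.C"
word = "a.c"  # hypothetical search term; "." means "any character" in a regex

# Without escaping, "." matches any character, so "abc" is replaced too.
unescaped = re.sub(word, "X", text, flags=re.IGNORECASE)

# With escaping, "." is matched literally, so only "a.c" and "A.C" are replaced.
escaped = re.sub(re.escape(word), "X", text, flags=re.IGNORECASE)

print(unescaped)  # X, X, X
print(escaped)    # X, abc, X
```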

Arpan Saini