I want to use regular expressions to replace things at the start/end of all words in a file. Here are some example cases:

  1. words ending in "ing" get changed to end in "gni": clearing = cleargni
  2. words starting with "sub" get changed to start with "bus": subtract = bustract

How can I isolate these words in a list and apply the example changes? All words are lowercase.

Scherf
  • Should "Subtract" change to "Bustract"? – zondo May 07 '16 at 22:17
  • Oh, it's all meant to be case-insensitive; I'll add that. – Scherf May 07 '16 at 22:18
  • Welcome to Stack Overflow. Please check Stack Overflow's [help on asking questions](http://stackoverflow.com/help/asking) first. Focus on [What topics can I ask about here](http://stackoverflow.com/help/on-topic), [What types of questions should I avoid asking?](http://stackoverflow.com/help/dont-ask), [How to ask a good question](http://stackoverflow.com/help/how-to-ask), [How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) and the [Stack Overflow question checklist](http://meta.stackoverflow.com/questions/260648/stack-overflow-question-checklist). – David Ferenczy Rogožan May 07 '16 at 23:07

2 Answers

2

Use \b to make sure something is at the beginning or end of a word:

import re

sentence = "..."
converted = re.sub(r'ing\b', 'gni', re.sub(r'\bsub', 'bus', sentence))
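To apply the same two substitutions to a list of words rather than one sentence, something along these lines should work. This is only a sketch: the helper name `swap_affixes` and the sample words are made up for illustration and are not part of the question.

```python
import re

def swap_affixes(text):
    """Hypothetical helper: turn a leading 'sub' into 'bus' and a trailing 'ing' into 'gni'."""
    # \b matches at a word boundary, so 'sub' is only replaced at the
    # start of a word and 'ing' only at the end of one.
    text = re.sub(r'\bsub', 'bus', text)
    return re.sub(r'ing\b', 'gni', text)

# Sample words (assumed for the example); 'single' and 'unsubtle' contain
# the letters but not at a word boundary, so they are left alone.
words = ['clearing', 'subtract', 'submitting', 'single', 'unsubtle']
converted = [swap_affixes(word) for word in words]
print(converted)
# ['cleargni', 'bustract', 'busmittgni', 'single', 'unsubtle']
```

Because the patterns rely on `\b` rather than `^` and `$`, the same helper can also be called on a whole line or file's contents at once.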
zondo
  • What is the lowercase `r` in front of the strings passed to `re.sub`? – Scherf May 07 '16 at 22:26
  • @Scherf: It means that it is a [raw string](https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-in-python-and-what-are-raw-string-l). – zondo May 07 '16 at 22:27
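A small sketch of why the `r` prefix matters for these patterns. In an ordinary Python string literal `'\b'` is the backspace character, so the regex engine never sees a word-boundary escape unless the backslash is preserved:

```python
import re

# In an ordinary string literal, '\b' is the single backspace character (0x08);
# in a raw string it stays as two characters: a backslash and a 'b'.
print(len('\b'), len(r'\b'))                  # 1 2

# With a raw string the regex engine receives \b and treats it as a word boundary.
print(re.sub(r'ing\b', 'gni', 'clearing'))    # cleargni

# Without the r prefix the pattern looks for 'ing' followed by a literal
# backspace, which 'clearing' does not contain, so nothing is replaced.
print(re.sub('ing\b', 'gni', 'clearing'))     # clearing

# Doubling the backslash is equivalent to using a raw string.
print(re.sub('ing\\b', 'gni', 'clearing'))    # cleargni
```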
1
import re

strings = ['clearing',
           'subtract']

for i, string in enumerate(strings):
    if re.match(pattern='.*ing$', string=string):
        strings[i] = re.sub(pattern='ing$', repl='gni', string=string)
    if re.match(pattern='^sub.*', string=string):
        strings[i] = re.sub(pattern='^sub', repl='bus', string=string)
print(strings)
Daniel
  • Why do you introduce a match object? Just do the substitution. – zondo May 07 '16 at 22:25
  • Without the match check, the first substitution replaces the 'ing' in 'clearing' with 'gni'. But 'clearing' does not match the second pattern, so the second `re.sub` returns the string it was passed unchanged, and the original 'clearing' is put back into the list. Run it to try it out. – Daniel May 07 '16 at 22:31
  • In that case, change `strings[i] = ...` to `string = strings[i] = ...` – zondo May 07 '16 at 22:33
  • This works too, though; I had not seen it before: `re.sub(r'ing\b', 'gni', re.sub(r'\bsub', 'bus', sentence))` – Daniel May 07 '16 at 22:39
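The fix suggested in the comments (rebinding `string` after each substitution) can be sketched as follows. The word 'subbing' is an extra test case added here, not part of the original answer; it needs both substitutions, and with the original loop the second assignment would start again from the unmodified word and produce 'busbing'.

```python
import re

strings = ['clearing', 'subtract', 'subbing']  # 'subbing' is an added test case

for i, string in enumerate(strings):
    # Rebind `string` after each substitution so the next one works on
    # the updated word instead of the original. re.sub returns its input
    # unchanged when nothing matches, so no re.match guard is needed.
    string = re.sub(r'ing$', 'gni', string)
    string = re.sub(r'^sub', 'bus', string)
    strings[i] = string

print(strings)  # ['cleargni', 'bustract', 'busbgni']
```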