
I have a body of text and want to remove "ion" or "s" from words while keeping the rest of each word (so reflection should become reflect). How would I go about that? I have tried:

new_llw = re.sub(r'[a-z]+ion', "", llw)
print(new_llw)

which replaces the whole word, and I tried

if re.search(r'[a-z]+ion', "", llw) is True:
    re.sub('ion', '', llw)

print(llw)

which gives me an error:

TypeError: unsupported operand type(s) for &: 'str' and 'int'

mkrieger1
dmoses
  • The way you've explained this, it doesn't sound like you even need RegEx. You could just use [`string.replace()`](https://www.w3schools.com/python/ref_string_replace.asp). Does it need to be at the end of the word? If so you should specify. – Jesse Sep 14 '22 at 23:35
  • Yeah, sadly my class is covering regex; otherwise there would be much simpler ways to do this. It doesn't need to be at the end of the word, but it should have more than one letter in front of it. – dmoses Sep 15 '22 at 05:14
  • Even then, regular expressions don't *need* to contain any regex groups or escape sequences. If you're just looking to replace one string with another, re.sub can be used like a normal replace function (as long as you escape any characters regex wouldn't treat literally). That being said, requiring one or more characters before the text does change things. Details like this are extremely important when writing a regular expression. In the future, when asking regex questions, please make sure details like this are part of the question. – Jesse Sep 15 '22 at 16:12

2 Answers


For the ion replacement, you may use a positive lookbehind:

import re

inp = "reflection"
output = re.sub(r'(?<=\w)ion\b', '', inp)
print(output)  # reflect
Tim Biegeleisen

The TypeError: unsupported operand type(s) for &: 'str' and 'int' error occurs because you are calling re.search(r'[a-z]+ion', "", llw) with the argument order of re.sub. re.search takes (pattern, string, flags), so the empty string "" is treated as the input and llw is passed as the flags. Flags are expected to be integer regex options (like re.A or re.I) that can be combined into a bitmask (re.A | re.I); when the re module applies a bitwise & to the string in llw, it raises the TypeError.
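As a rough sketch of how the second attempt could be repaired (the sample value of llw is made up for illustration): re.search returns a match object or None, never True, so the `is True` test would not pass either, and re.sub returns a new string that has to be assigned back.

```python
import re

llw = "a reflection on the station"  # sample text, assumed for illustration

# re.search(pattern, string) returns a Match object or None, never True,
# so test its truthiness instead of comparing with `is True`.
if re.search(r'[a-z]+ion', llw):
    # re.sub returns a new string; strings are immutable, so reassign.
    llw = re.sub(r'ion', '', llw)

print(llw)  # a reflect on the stat
```

Note that this plain 'ion' pattern removes ion anywhere in the text, including at the start of a word, which is why the boundary-based patterns are the safer choice.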

Now, if you need to match an ion as a suffix in a word, you can use

new_llw = re.sub(r'\Bion\b', '', llw)

Here, \B matches a non-word-boundary position; since the next character, i, is a word char, that position must be immediately preceded by a word char (a letter, digit or connector punctuation, like _). Then ion matches ion, and \b matches a position that is either at the end of the string or immediately followed by a non-word char.

To also match an s suffix:

new_llw = re.sub(r'\B(?:ion|s)\b', '', llw)

The (?:...) is a non-capturing group.
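To see how this pattern behaves on a longer string, here is a small sketch; the helper name strip_suffixes and the sample text are made up for illustration.

```python
import re

def strip_suffixes(text):
    """Remove a trailing 'ion' or 's' from words that have at least
    one word character in front of the suffix (hypothetical helper)."""
    return re.sub(r'\B(?:ion|s)\b', '', text)

sample = "reflection cats ion s reflections"
print(strip_suffixes(sample))  # reflect cat ion s reflection
```

The standalone ion and s are left alone because \B fails at the start of a word. Also note that reflections only loses its s: the ion inside it is not followed by a word boundary, and re.sub makes a single pass, so you would need to apply the substitution again if both suffixes should go.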

See the regex demo.

Variations

If you consider words as letter sequences only, you can use

new_llw = re.sub(r'(?<=[a-zA-Z])(?:ion|s)\b', '', llw) # ASCII only version
new_llw = re.sub(r'(?<=[^\W\d_])(?:ion|s)\b', '', llw) # Any Unicode letters supported

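Here, (?<=[a-zA-Z]) matches a location that is immediately preceded with an ASCII letter, and (?<=[^\W\d_]) matches a location that is immediately preceded with any Unicode letter (a word char that is neither a digit nor an underscore).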
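The three patterns only differ in what they accept right before the suffix. A quick comparison on a made-up sample (the variable names are just for illustration):

```python
import re

# Sample with a digit before "s", a non-ASCII letter before "s",
# and a plain ASCII word; \u00e9 is the letter e with an acute accent.
text = "v2s caf\u00e9s dogs"

any_word_char = re.sub(r'\B(?:ion|s)\b', '', text)
ascii_letters = re.sub(r'(?<=[a-zA-Z])(?:ion|s)\b', '', text)
unicode_letters = re.sub(r'(?<=[^\W\d_])(?:ion|s)\b', '', text)

print(any_word_char)    # v2 café dog    (digit counts as a word char)
print(ascii_letters)    # v2s cafés dog  (only ASCII letters qualify)
print(unicode_letters)  # v2s café dog   (any letter, but no digits or _)
```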

Wiktor Stribiżew