
How can I remove a substring from a string such as 'AA.01.001 Hello' using a regular expression in Python?

I tried the following, but it did not work:

import re
array = ['AB.01.001 Hello', 'BA.10.004', 'CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\\.[0-9]{2}\\.[0-9]{3}')
filtered = filter(lambda i: not regex.search(i), array)

Edit:

Expected output: ['Hello', 'Good bye']
Vineesh TP
  • I think the pattern works, right? Just add `\s*` after it to account for the whitespace, and use a single backslash. https://regex101.com/r/Jkht46/1 See Python demo https://ideone.com/rZfGLP – The fourth bird Jul 07 '19 at 13:31
  • Can you add some more examples? Will the string to replace always be in the format `'AA.01.001 '`? – Code Maniac Jul 07 '19 at 13:31
  • I think the issue is just the double backslashes. You are using a raw string. So you only need one backslash. – Neil Jul 07 '19 at 13:36

1 Answer


You may use re.sub:

import re
array= ['AB.01.001 Hello','BA.10.004','CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\s*')
filtered = filter(None, map(lambda i: regex.sub('', i), array))
print(list(filtered))
# => ['Hello', 'Good bye']

See the Python demo.

Note that I used only one \ to escape each literal dot: because the regex is defined with a raw string literal, \\. would instead mean a literal backslash followed by any character. I also added \s* so that zero or more whitespace characters after the pattern are removed as well.
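To illustrate the backslash point, here is a small sketch comparing the two patterns on one of the strings from the question. The variable names (double, single, sample) are made up for this illustration only.

```python
import re

# In a raw string, \\. means "a literal backslash, then any character",
# so this pattern requires backslashes that the input does not contain.
double = re.compile(r'[A-Z]{2,3}\\.[0-9]{2}\\.[0-9]{3}')

# In a raw string, \. means "a literal dot".
single = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}')

sample = 'AB.01.001 Hello'

double_match = double.search(sample)
single_match = single.search(sample)

print(double_match)          # None
print(single_match.group())  # AB.01.001
```

Note that the pattern in the question was also used only to test for a match and drop items, which can never produce 'Hello'; removing the matched text requires a substitution.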

Details

  • map(lambda i: regex.sub('', i), array) - iterates over the array items and removes the matches with re.sub
  • filter(None, ...) removes the empty items resulting from the substitution when the pattern matches the whole string.
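The same two steps can also be written as list comprehensions, which some readers find easier to follow than map and filter. This is only an equivalent sketch, not a different method; the intermediate name stripped is introduced here for illustration.

```python
import re

array = ['AB.01.001 Hello', 'BA.10.004', 'CD.10.015 Good bye']
regex = re.compile(r'[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\s*')

# Step 1: remove the code (and trailing whitespace) from every item.
stripped = [regex.sub('', item) for item in array]
print(stripped)   # ['Hello', '', 'Good bye']

# Step 2: drop the items that became empty strings.
filtered = [item for item in stripped if item]
print(filtered)   # ['Hello', 'Good bye']
```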
Wiktor Stribiżew