I want to remove digits from a string when they are attached directly to the end of a word, with no space in between. For example:

'Senku Ishigami is charecter from a manga series98 onging since 2017.'

should be:

'Senku Ishigami is charecter from a manga series onging since 2017.'

I can detect the numbers with the regex '[a-z]+[0-9]+', but I can't work out how to remove only the digits. I tried using '[a-z]' as the replacement, thinking it would keep the letters, but it just inserts the literal string '[a-z]'.

Here is the code:

import re

text ='Senku Ishigami is charecter from a manga series98 onging since 2017.'
text = re.sub(r'[a-z]+[0-9]+', '[a-z]', text)
print(text)

output:

Senku Ishigami is charecter from a manga [a-z] onging since 2017.
Wiktor Stribiżew

2 Answers

You could also use a capturing group that captures only the single character before the digits, followed by one or more digits.

In the replacement, refer to group 1 with \1

([a-z])\d+\b

import re

text ='Senku Ishigami is charecter from a manga series98 onging since 2017.'
text = re.sub(r'([a-z])\d+\b', r'\1', text)
print(text)

Output

Senku Ishigami is charecter from a manga series onging since 2017.
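
As a side-by-side check of why '[a-z]' came out literally while \1 works: the replacement argument of re.sub is a template, not a pattern, so a character class in it is inserted as plain text, whereas \1 (or the equivalent \g<1>) expands to whatever group 1 captured. A minimal sketch; the variable names literal, backref and explicit are only illustrative:

```python
import re

text = 'Senku Ishigami is charecter from a manga series98 onging since 2017.'

# A character class in the replacement is not interpreted; it is inserted as-is.
literal = re.sub(r'[a-z]+[0-9]+', '[a-z]', text)

# \1 in the replacement expands to the text captured by group 1 (the letter 's' here).
backref = re.sub(r'([a-z])\d+\b', r'\1', text)

# \g<1> is an equivalent, more explicit spelling of the same backreference.
explicit = re.sub(r'([a-z])\d+\b', r'\g<1>', text)

print(literal)   # Senku Ishigami is charecter from a manga [a-z] onging since 2017.
print(backref)   # Senku Ishigami is charecter from a manga series onging since 2017.
print(explicit)  # Senku Ishigami is charecter from a manga series onging since 2017.
```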
The fourth bird

You can use a lookbehind, so that only the digits are matched and replaced with an empty string:

import re

text ='Senku Ishigami is charecter from a manga series98 onging since 2017.'
text = re.sub(r'(?<=[a-z])\d+\b', '', text)
print(text) # => Senku Ishigami is charecter from a manga series onging since 2017.

Regex details

  • (?<=[a-z]) - a positive lookbehind: a location immediately preceded by a lowercase ASCII letter
  • \d+ - one or more digits
  • \b - word boundary (the digits will only be matched at the end of a word).
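
The effect of each piece can be checked on a few inputs. This is only a small sketch: the helper name strip_trailing_digits and the sample strings are made up for illustration, and the re.IGNORECASE variant is an optional extension, not part of the pattern above.

```python
import re

def strip_trailing_digits(s, flags=0):
    # Remove digits that directly follow an ASCII letter and end the word.
    # With flags=0 only lowercase letters satisfy the lookbehind.
    return re.sub(r'(?<=[a-z])\d+\b', '', s, flags=flags)

# Digits glued to a word are removed; a standalone number is kept,
# because the lookbehind requires a letter right before the digits.
print(strip_trailing_digits('series98 since 2017'))  # series since 2017

# \b fails between '3' and 'd', so digits inside a word are left alone.
print(strip_trailing_digits('abc123def'))  # abc123def

# [a-z] does not match an uppercase letter unless re.IGNORECASE is passed.
print(strip_trailing_digits('Series98 SERIES98'))  # Series SERIES98
print(strip_trailing_digits('Series98 SERIES98', re.IGNORECASE))  # Series SERIES
```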
Wiktor Stribiżew