-7

Suppose I have a string such as:

string = 'This string 22 is not yet perfect1234 and 123pretty but it can be.'

I want to remove digits that are attached to words, as in 'perfect1234' and '123pretty', but keep standalone numbers such as '22', so that the output is:

string = 'This string 22 is not yet perfect and pretty but it can be.'

Is there any way to do this in Python using regex or any other method? Any help would be appreciated. Thank you!

danielhadar
PJay
  • see here: http://stackoverflow.com/questions/12851791/removing-numbers-from-string – danielhadar Aug 01 '16 at 13:02
  • 1
    Seems like OP wants to eliminate only digits that are part of words, not any digits in the string. (Word boundary matters) – Keozon Aug 01 '16 at 13:03
  • 2
    However, it is not a well-asked question. @PJay, any patterns you tried? What issues do you have with your code? Also, check [How to ask](http://stackoverflow.com/help/how-to-ask). – Wiktor Stribiżew Aug 01 '16 at 13:04
  • Possible duplicate of [How to replace all occurrences of specific words in Python](http://stackoverflow.com/questions/25631695/how-to-replace-all-occurrences-of-specific-words-in-python) – Muhammad Yaseen Khan Aug 01 '16 at 13:09
  • I want to remove any digits that get mixed up with letters or any other non alphanumeric character such as '700/-'. In fact I also wanted to remove all numbers from my string such as phone numbers of the format '+91 ....'. Could you help me with this format as well? Any kind of digits appearing in my string need to be removed. – PJay Aug 02 '16 at 06:22

6 Answers

5
s = 'This string 22 is not yet perfect1234 and 123pretty but it can be.'

words = []
for word in s.split(' '):
    # Strip digits only from tokens that mix digits and letters.
    if any(c.isdigit() for c in word) and any(c.isalpha() for c in word):
        word = ''.join(c for c in word if not c.isdigit())
    words.append(word)
new_s = ' '.join(words)  # joining avoids a trailing space

And as a result:

'This string 22 is not yet perfect and pretty but it can be.'
danielhadar
  • More complicated than a regex (IMO), but probably faster in Python. Good answer, and I think more targeted to the OP's original intent. – Keozon Aug 01 '16 at 13:25
5

If you want to preserve digits that are by themselves (not part of a word with alpha characters in it), this regex will do the job (but there probably is a way to make it simpler):

import re
pattern = re.compile(r"\d*([^\d\W]+)\d*")
s = "This string is not yet perfect1234 and 123pretty but it can be. 45 is just a number."
pattern.sub(r"\1", s)
'This string is not yet perfect and pretty but it can be. 45 is just a number.'

Here, 45 is left because it is not part of a word.
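As one possible simplification, the same idea can be sketched with lookarounds: remove a run of digits only when a letter sits directly before or after it. The name `strip_mixed_digits` is made up for this illustration, and `[^\W\d_]` is used here as a shorthand for "a letter" (it also excludes the underscore, unlike the pattern above).

```python
import re

# A run of digits directly after a letter, or directly before one.
# [^\W\d_] matches a word character that is neither a digit nor "_".
MIXED_DIGITS = re.compile(r"(?<=[^\W\d_])\d+|\d+(?=[^\W\d_])")

def strip_mixed_digits(text):
    """Remove digits glued to letters; leave standalone numbers alone."""
    return MIXED_DIGITS.sub("", text)

print(strip_mixed_digits(
    "This string 22 is not yet perfect1234 and 123pretty but it can be."
))
# This string 22 is not yet perfect and pretty but it can be.
```

Digits that touch only punctuation or whitespace, such as '22' or '700/-', are left untouched by this pattern.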

Keozon
-1
import re
re.sub(r'\d+', '', string)
Kane Blueriver
  • should use raw string literals for regex `r'\d+'` and this doesn't check if the numbers are part of a word also containing alpha characters (which seems to be the intent) – Keozon Aug 01 '16 at 13:14
  • @Keozon Yes, raw string is better, I would change my answer. But what do you mean ''numbers are part of a word ", can you give an example? – Kane Blueriver Aug 01 '16 at 13:17
  • Thank you for your help! I don't want to keep anything that has the following format in my string : '700/-', '+91 1234567891', '3appeared' , 'Vora02261794300Will'. Numbers or words such as the last two in the example should not be present in the string after processing. – PJay Aug 02 '16 at 06:32
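For the stricter requirement in the last comment (dropping whole tokens such as '700/-', '+91 1234567891', '3appeared' and 'Vora02261794300Will'), a minimal sketch is to discard every whitespace-separated token that contains a digit. The function name and the sample sentence are invented for illustration, and this assumes tokens are separated by whitespace.

```python
def drop_tokens_with_digits(text):
    """Drop every whitespace-separated token that contains a digit."""
    # split() with no argument also collapses runs of whitespace.
    kept = (tok for tok in text.split() if not any(ch.isdigit() for ch in tok))
    return " ".join(kept)

# Hypothetical sample built from the formats listed in the comment.
sample = "Call +91 1234567891 now, 3appeared for 700/- only Vora02261794300Will"
print(drop_tokens_with_digits(sample))
# Call now, for only
```

Unlike the digit-stripping approaches, this removes the letters in a mixed token as well, which is what the comment asks for.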
-1

The code below checks each character for a digit. If it isn't a digit, it adds the character to the end of the corrected string.

string = 'This string is not yet perfect1234 and 123pretty but it can be.'

CorrectedString = ""
for characters in string:
    if characters.isdigit():
        continue
    CorrectedString += characters
yarz-tech
-1

You can do this with str.join alone, without importing anything:

str_var = 'This string 22 is not yet perfect1234 and 123pretty but it can be.'

# Keep pure-number tokens as they are; strip digits from every other token.
str_var = ' '.join(x if x.isdigit() else ''.join(c for c in x if not c.isdigit()) for x in str_var.split(' '))
print(str_var)

Output:

This string 22 is not yet perfect and pretty but it can be.
Anand Tripathi
-1

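strng = 'This string 22 is not yet perfect1234 and 123pretty but it can be.'
print(''.join(x for x in strng if not x.isdigit()).replace('  ', ' '))

P.S. After removing the digits, replace each double space with a single space. Note that this removes every digit, including the standalone '22'.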

output:

This string is not yet perfect and pretty but it can be.