
I have a numeric string that sometimes contains letters. I need to delete the first letter and everything after it from my input string. I have tried:

import re
b = re.split('[a-z]|[A-Z]', a)
b = b[0]

But this returns an error whenever the string doesn't contain letters.

Do I need to check if a contains letters before trying to split at their point?

How can I check that?

Two examples:

a = '1234' 

and

a = '1234h'

I want b = '1234' in both cases.


Another example:
I want a= '4/200, 3500/ 500 h3m' or a= '4/200, 3500/ 500h3m' to return something like:

b= ['4', '200', '3500', '500']
martineau
m123
  • Why do you have a one-element list containing a string, rather than just a string? – alani Aug 23 '20 at 21:38
  • Try [this](https://stackoverflow.com/questions/1450897/remove-characters-except-digits-from-string-using-python) out ? – Delrius Euphoria Aug 23 '20 at 21:38
  • what is the error? – Abdeslem SMAHI Aug 23 '20 at 21:38
  • Anyway, you want `re.sub` and the pattern should match a letter followed by any amount of other characters (as much as possible), and the replacement should be the empty string. – alani Aug 23 '20 at 21:40
  • if OP is indeed using a list with a single string in it, that is throwing the error, since re will show the following error then: TypeError: expected string or bytes-like object. Iterate over the list or don't use a list at all – Leander Aug 23 '20 at 21:42
  • @Cool Cloud: Thank you. But I need to delete all the digits that come after the first letter as well (if there is one). The link you sent deletes all letters and keeps all digits. – m123 Aug 23 '20 at 21:55
  • @alaniwi: You're right, I checked. The [ ]s were added after my tests and computations for splitting. I omitted the [ ] from the input. Thanks. – m123 Aug 23 '20 at 22:38
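
The `re.sub` approach suggested in the comments above could be sketched roughly as follows. This is only an illustration, assuming "letters" means ASCII letters; `strip_from_first_letter` is a hypothetical helper name, not something from the original post.

```python
import re

def strip_from_first_letter(s):
    # Replace the first ASCII letter and everything after it with nothing.
    # re.DOTALL lets '.' also match newlines, in case the input spans lines.
    return re.sub(r'[A-Za-z].*', '', s, flags=re.DOTALL)

print(strip_from_first_letter('1234'))   # 1234
print(strip_from_first_letter('1234h'))  # 1234
```

A string without letters passes through unchanged, so no separate check for letters is needed before calling it.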

2 Answers

import re

# Match one or more digits anchored at the start of the string.
match = re.search(r'^\d+', '1234h')
if match:
    print(match.group(0))  # 1234

This returns '1234' for both '1234' and '1234h'. The pattern matches the run of digits at the start of the string and ignores everything from the first letter onward.
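
For the second example in the question, where several numbers appear before the first letter, one possible extension is to cut the string at the first letter and then collect each run of digits from what remains. This is a sketch that goes beyond the original answer, assuming ASCII letters only; `numbers_before_first_letter` is a hypothetical name.

```python
import re

def numbers_before_first_letter(s):
    # Keep only the part of the string before the first ASCII letter.
    head = re.split(r'[A-Za-z]', s, maxsplit=1)[0]
    # Collect each run of consecutive digits in that part.
    return re.findall(r'\d+', head)

print(numbers_before_first_letter('4/200, 3500/ 500 h3m'))
# ['4', '200', '3500', '500']
print(numbers_before_first_letter('4/200, 3500/ 500h3m'))
# ['4', '200', '3500', '500']
```

Because `re.split` returns the whole string as a single element when nothing matches, inputs without letters such as '1234' also work and give ['1234'].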

Yuri R
strings = ['1234abc', '5278', 'abc58586']

def range_sanitizer(s, lower_bound, upper_bound):
    # Keep only characters in the inclusive range [lower_bound, upper_bound].
    return ''.join([x for x in s if lower_bound <= x <= upper_bound])

def remove_non_digits(s):
    # Note: this keeps every digit, including digits that follow a letter.
    return range_sanitizer(s, '0', '9')

def sanitize_list(items, sanitizer=remove_non_digits):
    return [sanitizer(item) for item in items]

if '__main__' == __name__:
    sanitized_list = sanitize_list(strings)

    # ['1234', '5278', '58586']
    print(sanitized_list)
Aviv Yaniv