2

In Python, I am trying to cut a string at the first occurrence of an integer: the first integer should be included in the result, and everything after it should be dropped.

Example strings that I will have are shown below:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---
SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3
AND SOME OTHER (AGAIN) 2 1 4

And the outputs that I need for these examples are going to be:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2

Structure of all input strings will be in this format. Any help will be appreciated. Thank you in advance.

I first tried splitting on spaces (" "), but of course that did not work. Then I tried splitting on the "---" occurrence, but "---" may not exist in every input, so that failed too. I also looked at this question: How to split a string into a string and an integer? However, the answer there suggests splitting on spaces, so it didn't help me.

codeine

3 Answers

2

This is an ideal case for a regular expression.

import re

s = "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---"
m = re.search(r".*?[0-9]+", s)
print(m.group(0))

Explanation:

  • .* matches any number of characters
  • ? makes the preceding .* non-greedy (without it, the match would run up to the last integer in the string)
  • [0-9]+ matches one or more digits
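To see the effect of the ? described above, here is a minimal sketch that runs both the non-greedy and the greedy pattern over the three example strings from the question. The names lazy, greedy and cut_at_first_int are made up for this illustration, and returning None when no integer is found is an assumption about the desired behaviour.

```python
import re

examples = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

lazy = re.compile(r".*?[0-9]+")   # non-greedy: stops after the first integer
greedy = re.compile(r".*[0-9]+")  # greedy: runs on to the last integer


def cut_at_first_int(text):
    """Hypothetical helper: text up to and including the first integer,
    or None when the text contains no digits (an assumed convention)."""
    m = lazy.search(text)
    return m.group(0) if m else None


lazy_results = [cut_at_first_int(s) for s in examples]
greedy_results = [greedy.search(s).group(0) for s in examples]

for short, long_ in zip(lazy_results, greedy_results):
    print(repr(short), "vs", repr(long_))
```

Because [0-9]+ consumes a whole run of digits, a multi-digit number such as 25 should be kept intact by the non-greedy pattern.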

It can be done without regular expressions too:

result = []
for word in s.split(" "):
    result.append(word)
    if word.isdigit():  # True when every character of the word is a digit
        break
print(" ".join(result))
kosciej16
  • Are we sure the numbers are single digit? – MYousefi Nov 07 '22 at 23:28
  • Well, it works. Thank you very much. Is there any other way that I can solve this question without using regular expressions? If not, I am going to study a bit of them. – codeine Nov 07 '22 at 23:29
1

Solution without re:

lst = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

for item in lst:
    idx = next(idx for idx, ch in enumerate(item) if ch.isdigit())
    print(item[: idx + 1])

Prints:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2
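
The loop above stops at the first digit character, so a multi-digit number such as 25 would be cut down to 2, and next() raises StopIteration when a string has no digit at all. A possible variant that handles both cases is sketched below; the function name prefix_through_first_int and the choice to return the string unchanged when it contains no digit are assumptions for this illustration, not part of the answer.

```python
def prefix_through_first_int(text):
    """Hypothetical helper: return text up to and including the first
    whole integer; return text unchanged if it has no digit (assumed)."""
    # Index of the first digit, or None when there is none.
    start = next((i for i, ch in enumerate(text) if ch.isdigit()), None)
    if start is None:
        return text

    # Extend past every consecutive digit so "25" is not cut to "2".
    end = start
    while end < len(text) and text[end].isdigit():
        end += 1
    return text[:end]


print(prefix_through_first_int("AND SOME OTHER (AGAIN) 25 1 4"))
```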
Andrej Kesely
0

Try the following regular expression:

import re
r = re.compile(r'(\D*\d+).*')  # raw string, so \D and \d reach the regex engine unchanged
r.match('SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 -').groups()[0]
==> 'SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2'
Alberto Garcia