separate number from the string

Question

Separate the numbers from the string, but when there are consecutive '1's, split them into separate items.

I think there must be a smart way to solve this.

s = 'NNNN1234N11N1N123'

The expected result is:

['1234','1','1','1','123']

Shouldn't it be `['1234','1','1','123']` instead of `['1234','1','1','23']`? — dcg, Aug 08 '19 at 04:55
re.findall("\d+", i): https://stackoverflow.com/questions/26825729/extract-number-from-string-in-python/26825781 — Jainil Patel, Aug 08 '19 at 05:08

dcg · Accepted Answer · 2019-08-08T14:42:09.040

I think what you want can be done with the re module:

>>> import re
>>> re.findall('(?:1[2-90]+)|1', 'NNNN1234N11N1N123')
['1234', '1', '1', '1', '123']

EDIT: As suggested in the comments by @CrafterKolyan, the regular expression can be simplified to 1[2-90]*.
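
As a quick check, the sketch below runs both patterns on the question's string. Note that `[2-90]` is the range 2-9 plus 0, i.e. any digit except 1, so both patterns only match numbers that begin with 1. The second input is a hypothetical string, not taken from the question, added only to show that limitation.

```python
import re

s = 'NNNN1234N11N1N123'

original = re.findall('(?:1[2-90]+)|1', s)
simplified = re.findall('1[2-90]*', s)

print(original)    # ['1234', '1', '1', '1', '123']
print(simplified)  # ['1234', '1', '1', '1', '123']

# Hypothetical input: a number that does not start with 1 is skipped.
skipped = re.findall('1[2-90]*', 'N2345N11')
print(skipped)     # ['1', '1']
```

If the real data can contain numbers that start with other digits, the pattern would need to be adapted.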

edited Aug 08 '19 at 14:42

answered Aug 08 '19 at 04:58

dcg

Thank you, I'm not familiar with re. I will study your code, I appreciate it. – Todd Aug 08 '19 at 04:59
Well, take a look at regular expressions (and the `re` module of course), they are very useful! – dcg Aug 08 '19 at 05:01
Yeah, I will. I re-edited the question because I forgot something. Thank you. – Todd Aug 08 '19 at 05:04
I fixed the regular expression, so the result is as expected. – dcg Aug 08 '19 at 05:10
Regex can be easier `1[2-90]*` – CrafterKolyan Aug 08 '19 at 05:42
re.findall('(1[2-90]+)|1', 'NN1234**12222311*1112N12311NN123456**1222') returns ['1234', '122223', '', '', '', '', '12', '123', '', '', '123456', '1222']. The '1's are missing. What happened? :) – Todd Aug 08 '19 at 06:14
I think you missed the `(?:` – dcg Aug 08 '19 at 06:29
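
The difference comes from how `re.findall` treats groups: when the pattern has one capturing group, it returns that group's text for every match, so a match made by the bare `1` alternative shows up as an empty string. A small sketch of both variants on the string from the comment above:

```python
import re

s = 'NN1234**12222311*1112N12311NN123456**1222'

# Capturing group: findall returns group 1, which is '' whenever the
# second alternative (the bare '1') is the one that matched.
capturing = re.findall('(1[2-90]+)|1', s)
print(capturing)
# ['1234', '122223', '', '', '', '', '12', '123', '', '', '123456', '1222']

# Non-capturing group (?:...): findall returns the whole match instead.
non_capturing = re.findall('(?:1[2-90]+)|1', s)
print(non_capturing)
# ['1234', '122223', '1', '1', '1', '1', '12', '123', '1', '1', '123456', '1222']
```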

score 0 · Answer 2 · edited Aug 12 '19 at 16:39

I would also use regular expressions (the re module), but with a different function, namely re.split, in the following way:

import re
s = 'NNNN1234N11N1N123'
output = re.split(r'[^\d]+|(?<=1)(?=1)',s)
print(output) # ['', '1234', '1', '1', '1', '123']
output = [i for i in output if i] # jettison empty strs
print(output) # ['1234', '1', '1', '1', '123']

Explanation: You want to split a str into a list of strs, which is what re.split is for. The first argument of re.split says where the splits should happen; everything it matches is removed unless capturing groups are used (similar to the str method split). I need to cut in two kinds of places, so I used | (alternation) to tell re.split to cut at:

[^\d]+, that is, one or more non-digits
(?<=1)(?=1), that is, the empty str preceded by 1 and followed by 1; here I used zero-length assertions (a lookbehind and a lookahead)

Note that re.split produced '' (an empty str) before your desired output. This means that the first cut (NNNN in this case) started at the beginning of the str. This is the expected behavior of re.split, but we do not need that information here, so we can discard any empty strs, for which I used a list comprehension.
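
To see what each alternative contributes, this sketch applies them separately. One point worth stating: splitting on an empty match, which the `(?<=1)(?=1)` part relies on, is supported by `re.split` from Python 3.7 onward. The helper name `split_numbers` is made up for illustration.

```python
import re

s = 'NNNN1234N11N1N123'

# Only the non-digit separator: consecutive 1s stay glued together.
by_non_digits = re.split(r'[^\d]+', s)
print(by_non_digits)   # ['', '1234', '11', '1', '123']

# Only the zero-width position between two 1s (needs Python 3.7+).
between_ones = re.split(r'(?<=1)(?=1)', '111')
print(between_ones)    # ['1', '1', '1']

def split_numbers(text):
    """Hypothetical helper: split on non-digits and between adjacent 1s."""
    parts = re.split(r'[^\d]+|(?<=1)(?=1)', text)
    return [p for p in parts if p]  # drop empty strings

print(split_numbers(s))  # ['1234', '1', '1', '1', '123']
```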
