Split and keep delimiter, preferably with regex

Question

Let's say I have this text:

1.1 This is the 2,1 first 1.2 This is the 2,2 second 1.3 This is the 2,3 third

and I want:

["1.1 This is the 2,1 first","1.2 This is the 2,2 second","1.3 This is the 2,3 third"]

Note that:

I can't use re.findall, since I can't think of a way to properly terminate the match. The best I could think of was '[0-9]+\.[0-9]+^([0-9]+\.[0-9]+)*', which didn't work.
I can't just store the delimiter as a global variable, since it changes with each match.
I could not use a regular re.split because I want to keep the delimiter. I can't use a lookbehind because it has to be fixed width, and this isn't.

I have read regexp split and keep the seperator, Python split() without removing the delimiter, and In Python, how do I split a string and keep the separators?, and still don't have an answer.

But you *don't* keep the delimiter (the space) on which you're splitting. — jonrsharpe, Oct 13 '16 at 19:35

score 2 · Answer 1 · answered Oct 13 '16 at 19:34

Yes, you can:

\b\d+\.\d+
.+?(?=\d+\.\d+|$)

See it working on regex101.com. Use the two parts as a single pattern with re.findall():

import re
rx = re.compile(r'\b\d+\.\d+.+?(?=\d+\.\d+|$)')
string = "1.1 This is the 2,1 first 1.2 This is the 2,2 second 1.3 This is the 2,3 third "
matches = rx.findall(string)
print(matches)
# ['1.1 This is the 2,1 first ', '1.2 This is the 2,2 second ', '1.3 This is the 2,3 third ']

If the string spans across multiple lines, use either the dotall mode or [\s\S]*?.
See a demo on ideone.com.
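
As a minimal sketch of the multi-line case mentioned above: the input string here is hypothetical (the question's sample text with line breaks added), and the trailing spaces are stripped afterwards to match the output the question asks for.

```python
import re

# Hypothetical input: the sample text with line breaks inside two entries.
text = "1.1 This is\nthe 2,1 first 1.2 This is\nthe 2,2 second 1.3 This is the 2,3 third"

# re.DOTALL lets '.' match newlines, so one entry may span several lines.
rx = re.compile(r'\b\d+\.\d+.+?(?=\d+\.\d+|$)', re.DOTALL)

# Each match keeps the space before the next number; strip() removes it.
matches = [m.strip() for m in rx.findall(text)]
print(matches)
# ['1.1 This is\nthe 2,1 first', '1.2 This is\nthe 2,2 second', '1.3 This is the 2,3 third']
```

Without the flag, `.` stops at the first newline and the lookahead can no longer be reached from there, so entries that contain a line break are not matched.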

score 0 · Answer 2 · answered Oct 14 '16 at 06:10

Split on each space that is followed by a number such as 1.2 or 1.3. Escape the dot so that it does not also match the comma in 2,1:

re.split(r' (?=\d+\.\d+)', s)
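
A short sketch of this lookahead split on the question's sample text, comparing an escaped dot with an unescaped one. The variable names `strict` and `loose` are only for illustration.

```python
import re

s = "1.1 This is the 2,1 first 1.2 This is the 2,2 second 1.3 This is the 2,3 third"

# Escaped dot: split only on a space followed by digits, a literal '.', and digits.
strict = re.split(r' (?=\d+\.\d+)', s)
print(strict)
# ['1.1 This is the 2,1 first', '1.2 This is the 2,2 second', '1.3 This is the 2,3 third']

# Unescaped dot: '.' matches any character, so the space before "2,1" also splits.
loose = re.split(r' (?=\d.\d)', s)
print(loose)
# ['1.1 This is the', '2,1 first', '1.2 This is the', '2,2 second', '1.3 This is the', '2,3 third']
```

Because the lookahead consumes nothing, the number stays at the start of each piece and only the separating space is dropped.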

answered by zxy
