5

I'm splitting a string with some separator, but want the separator matches as well:

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"
print(re.split(r"\d+\.\d+\.\d+", s))
print(re.findall(r"\d+\.\d+\.\d+", s))

I can't find an easy way to combine the 2 lists I get:

['oren;moish', '/-/v', '/barbi']
['30.4.200', '6.99.5']

Into the desired output:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
OrenIshShalom

4 Answers

4

Another solution (regex101):

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"

x = re.findall(r"\d+\.\d+\.\d+|.+?(?=\d+\.\d+\.\d+|\Z)", s)
print(x)

Prints:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
Andrej Kesely
4

From the re.split docs:

If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list.

So just wrap your regex in a capturing group:

print(re.split(r"(\d+\.\d+\.\d+)", s))
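For reference, here is a small self-contained sketch of the capturing-group approach applied to the string from the question. The helper name split_keep and the second input "1.2.3/tail" are made up for illustration. It also shows one behaviour worth knowing: when a match sits at the very start (or end) of the string, re.split returns an empty string on that side.

```python
import re

# Pattern from the question: three runs of digits separated by dots.
VERSION = r"\d+\.\d+\.\d+"


def split_keep(text):
    """Split on VERSION and keep the matched separators (hypothetical helper)."""
    # Wrapping the whole pattern in one capturing group makes re.split
    # return the separator text between the surrounding pieces.
    return re.split(f"({VERSION})", text)


parts = split_keep("oren;moish30.4.200/-/v6.99.5/barbi")
print(parts)  # ['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']

# A separator at the very start leaves an empty string before it.
edge = split_keep("1.2.3/tail")
print(edge)  # ['', '1.2.3', '/tail']
```

If those empty strings are unwanted, filtering with a list comprehension such as [p for p in parts if p] removes them.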
user2357112
3

Try this:

import re
s = "oren;moish30.4.200/-/v6.99.5/barbi"
print([x for y in re.findall(r"(?:([A-Za-z;\/-]+)|(\d+\.\d+\.\d+))", s) for x in y if x])

Result:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
Cow
0

You could use re.findall and a pattern to match:

\d+\.\d+\.\d+|\D+(?:\d(?!\d*\.\d+\.\d)\D*)*

Explanation

  • \d+\.\d+\.\d+ Match three runs of 1+ digits separated by single dots
  • | Or
  • \D+ Match 1+ chars other than a digit
  • (?: Non capture group to repeat as a whole part
    • \d(?!\d*\.\d+\.\d) Match a single digit, asserting that it is not the start of the digits-and-dots pattern (checked by the negative lookahead to its right)
    • \D* Match optional chars other than a digit
  • )* Close the non capture group and optionally repeat it

See a regex demo.

Example

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"
pattern = r"\d+\.\d+\.\d+|\D+(?:\d(?!\d*\.\d+\.\d)\D*)*"
print(re.findall(pattern, s))

Output

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
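
The sample string above contains no digits outside the version numbers, so it does not exercise the negative lookahead. As a rough illustration, the sketch below runs the same pattern on a made-up input, "a1b2.3.4c", and compares it with a simpler pattern (an assumption for contrast, not part of the answer) that lacks the lookahead group:

```python
import re

# Pattern from the answer above.
pattern = r"\d+\.\d+\.\d+|\D+(?:\d(?!\d*\.\d+\.\d)\D*)*"
# Simplified variant without the lookahead group, for comparison only.
naive = r"\d+\.\d+\.\d+|\D+"

sample = "a1b2.3.4c"  # made-up input with a stray digit before the version

pieces = re.findall(pattern, sample)
naive_pieces = re.findall(naive, sample)

print(pieces)        # ['a1b', '2.3.4', 'c']
print(naive_pieces)  # ['a', 'b', '2.3.4', 'c']
```

With the lookahead, the stray 1 stays inside the surrounding text; without it, the 1 matches neither alternative and is silently dropped from the result.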
The fourth bird