I want to match the digits between "000" and "000", between \b and "000", or between "000" and \b, in a string like this:

11101110001011101000000011101010111

I have tried with expressions like this:

(?<=000)\d+(?=000)

but I only get the longest match.

I expect to get:

1110111
1011101
0
11101010111
Vikash Chauhan

2 Answers

You can use the regex package and the .findall() method:

In [1]: s = "11101110001011101000000011101010111"

In [2]: import regex

In [3]: regex.findall(r"(?<=000|^)\d+?(?=000|$)", s)
Out[3]: ['1110111', '1011101', '0', '00011101010111']

The alternations 000|^ and 000|$ match either 000 or the beginning and the end of the string, respectively. Also note the ? after \d+: it makes the quantifier non-greedy.
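To see what the ? changes, here is a minimal sketch using the standard re module and only the fixed-width look-arounds from the question (so without the start/end alternation). The names greedy and lazy are illustrative, not part of the original answer.

```python
import re

s = "11101110001011101000000011101010111"

# Greedy: \d+ grabs as many digits as it can, then backtracks
# to the LAST position that is still followed by "000".
greedy = re.findall(r"(?<=000)\d+(?=000)", s)

# Non-greedy: \d+? stops at the FIRST position followed by "000".
lazy = re.findall(r"(?<=000)\d+?(?=000)", s)

print(greedy)  # ['10111010000']
print(lazy)    # ['1011101', '0']
```

Neither variant returns the first or last chunk, because nothing in these patterns allows a match to begin at the start of the string or to stop at its end.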

Note that the regular re.findall() would fail with the following error in this case:

error: look-behind requires fixed-width pattern

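This is because re requires a look-behind to have a fixed width, and the alternatives 000 and ^ have different widths (3 and 0). The regex package supports variable-width look-behind.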
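The failure can be reproduced without running a search at all, since re rejects the pattern at compile time. This is a small sketch, assuming a CPython version whose re module still requires fixed-width look-behind; the compiled and message variables are illustrative.

```python
import re

pattern = r"(?<=000|^)\d+?(?=000|$)"

try:
    re.compile(pattern)
    compiled = True
    message = ""
except re.error as exc:
    # The alternatives "000" and "^" have different widths,
    # so the standard re module refuses the look-behind.
    compiled = False
    message = str(exc)

print(compiled, message)
```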

alecxe

You can do it with the re module like this:

re.findall(r'(?:\b|(?<=000))(\d+?)(?:000|\b)', s)
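As a quick check, here is that call run against the string from the question. The variable name result is illustrative.

```python
import re

s = "11101110001011101000000011101010111"

# The capturing group (\d+?) is what findall returns; the trailing
# (?:000|\b) consumes the delimiter instead of only asserting it.
result = re.findall(r'(?:\b|(?<=000))(\d+?)(?:000|\b)', s)

print(result)  # ['1110111', '1011101', '0', '11101010111']
```

Because the trailing 000 is consumed rather than merely looked ahead at, the run of seven zeros is used up as 000, then 0, then 000, and the last chunk starts cleanly at 11101010111. Each look-behind alternative here is fixed-width on its own, which is why re accepts the pattern.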
Casimir et Hippolyte