
The \G modifier:

\G forces the pattern to return only matches that form a continuous chain. The first match must begin at the start of the string, and each subsequent match must begin exactly where the previous match ended.

However, \G is not an available anchor in Python's standard re module. Is there a way to convert what might have been done with \G into a Python-friendly regex? What would be an example of a regex containing the \G anchor that could be rewritten as a regex that compiles in Python?

samuelbrody1249
  • You can use the PyPI `regex` module instead of the standard `re` module to use `\G`; otherwise, you'll likely have to use your regex in a loop. – ctwheels Nov 11 '19 at 19:25
  • @ctwheels Yea, I looked through this -- https://pypi.org/project/regex/ -- and it offers a lot! I was just wondering how `\G` could be used (or substituted for) if possible with using the standard `re` module. – samuelbrody1249 Nov 11 '19 at 19:26
  • There's no direct substitution other than constructing a really long regex with all the possibilities, or generating it using code. The best alternative with the standard `re` module is just to loop. – ctwheels Nov 11 '19 at 19:28
  • @ctwheels I see -- in that case, do you just want to post your comments in an answer below? Option 1 -- use the regex module; Option 2 -- use this basic `for` loop, for example... – samuelbrody1249 Nov 11 '19 at 19:30

1 Answer


There is no direct equivalent of \G in the standard re module. Your options are as follows:

The example below assumes you want to find the digits at the start of a string, one at a time: match each leading digit until a non-digit character appears, then stop. It is simple, but I think it illustrates the issue quite well.

PyPI regex

You can use the PyPI regex module (or some other module that supports \G) instead.


import regex
s = '12x34x56'
r = r'\G\d'
print(regex.findall(r,s))

Output:

['1', '2']

Use-case specific standard regex

Without the PyPI regex module, you'd have to come up with stricter rules or change your logic altogether. How the logic changes depends heavily on how \G is used in the original pattern:


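import re

s = '12x34x56'
r = r'\d+'
# re.match only matches at the start of the string, so this captures the leading run of digits
m = re.match(r, s)
# Split the run into single digits; fall back to an empty list when there are no leading digits
print(list(m[0]) if m else [])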

Output:

['1', '2']
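
Loop with the standard re module

The loop mentioned in the comments can be sketched as follows. A compiled pattern's match() method accepts a start position and only succeeds if the pattern matches exactly at that position, which is close to what \G provides. The helper name findall_contiguous is hypothetical, and this is a minimal sketch rather than a full emulation: it returns whole matches only (no capture-group handling) and simply stops on a zero-width match.

```python
import re

def findall_contiguous(pattern, text, pos=0):
    # Hypothetical helper: collect matches that form an unbroken chain from pos.
    compiled = re.compile(pattern)
    matches = []
    while pos <= len(text):
        # Pattern.match(text, pos) only succeeds if the match starts exactly at pos
        m = compiled.match(text, pos)
        if m is None:
            break  # chain is broken, so stop
        matches.append(m.group(0))
        if m.end() == pos:
            break  # zero-width match: stop instead of looping forever
        pos = m.end()  # the next match must begin where this one ended
    return matches

print(findall_contiguous(r'\d', '12x34x56'))  # ['1', '2']
```

Because each iteration restarts at the end of the previous match, the later digits in '12x34x56' are never reached once 'x' breaks the chain.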
ctwheels