If I write a Python (3.7) program that looks like this:

import re
regx = re.compile("test")
print(regx.findall("testest"))

and run it, then I will get:

['test']

Even though there are two instances of "test", it only shows me one. I think this is because the last letter of the first "test" is also the first letter of the second "test". How can I write a program that gives me ["test", "test"] instead?

user8408080

2 Answers

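You will want to put a capturing group inside a lookahead, (?=(regex_here)). The lookahead matches without consuming any characters, so the next match can start one character later, and the group captures the text: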

import re
regx = re.compile("(?=(test))")
print(regx.findall("testest"))

>>> ['test', 'test']
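
If you need this for more than one pattern, the same idea can be wrapped in a small helper. This is only a sketch: the name find_overlapping is made up for illustration, and it assumes you also want the start position of each match.

```python
import re

def find_overlapping(pattern, text):
    """Return (start, matched_text) for every match, overlapping ones included."""
    # Hypothetical helper: wrap the pattern in a capturing group inside a
    # lookahead. The lookahead itself matches an empty string, so the scan
    # moves forward one character at a time and can find overlaps.
    wrapped = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in wrapped.finditer(text)]

print(find_overlapping("test", "testest"))  # [(0, 'test'), (3, 'test')]
```

Using finditer instead of findall keeps the match objects around, so both the position and the captured text are available.
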
Spencer Wieczorek

findall only returns non-overlapping matches. Each match consumes the characters it covers, and the search for the next match resumes after the end of the previous one, so overlapping occurrences are not found.

To get around this, you need a feature of Python regular expressions called a lookahead assertion. You look for each t that is followed by est. The lookahead does not consume any of the string, so each match covers only the t and the scan can still find the second occurrence.

    import re

    regx = re.compile('t(?=est)')

    print([m.start() for m in regx.finditer('testest')])

[0, 3]
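
If you would rather not rewrite the pattern, another option is to keep the original regex and restart the search one character after each match's start. This is a rough sketch, not the only way to do it; overlapping_spans is a made-up name, and it relies on the optional pos argument of a compiled pattern's search method.

```python
import re

def overlapping_spans(pattern, text):
    """Return the (start, end) span of every match, overlapping ones included."""
    regx = re.compile(pattern)
    spans = []
    pos = 0
    # Stop once pos runs past the end of the text, which also guards
    # against looping forever on patterns that can match the empty string.
    while pos <= len(text):
        m = regx.search(text, pos)
        if m is None:
            break
        spans.append(m.span())
        # Resume one character after this match's start, not after its end.
        pos = m.start() + 1
    return spans

text = "testest"
spans = overlapping_spans("test", text)
print(spans)                          # [(0, 4), (3, 7)]
print([text[a:b] for a, b in spans])  # ['test', 'test']
```

Slicing the text with each span gives back the matched substrings, so you get both the positions and the ['test', 'test'] list the question asks for.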

More details on this page: https://docs.python.org/3/howto/regex.html

soundstripe