
I am just getting started with regex and am trying to understand some of its constructs. I can use a positive lookahead like this:

import re

asdaf = re.compile(r'[1-9](?=u)')
asdaf.findall("rm4m3455ukdfr6i")

which results in:

['5']

So far so good. However, if I extend the pattern to match more than one digit

bsdaf = re.compile(r'[1-9]*(?=u)')
bsdaf.findall("rm4m3455ukdfr6i")

I get this list

['3455', '']

Can someone explain why there is a second empty entry in the list and how to avoid this second entry?

Thanks in advance!

Jay
  • 1
    `*` means *0 or more*, i.e. `[1-9]*` can match an empty string. Use `+`. Never use empty string matching patterns if you do not expect empty matches in the output. – Wiktor Stribiżew Jul 28 '20 at 21:07
  • 1
    The first match is `3455`. The regex's internal string pointer is then located between `'5'` and `'u'`. The next match is then attempted. The lookahead `(?=u)` is again satisfied, as is `[1-9]*` with a zero-width (empty string) match. – Cary Swoveland Jul 28 '20 at 21:32
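The two comments above can be checked with a short sketch. It uses `finditer` instead of `findall` so the position of each match is visible, then swaps `*` for `+` as suggested. The variable names (`text`, `star`, `plus`, `spans`, `result`) are only illustrative and not from the question.

```python
import re

text = "rm4m3455ukdfr6i"

# With `*`, the digit class may match zero characters, so an empty
# match is allowed wherever the lookahead (?=u) succeeds.
star = re.compile(r'[1-9]*(?=u)')
spans = [m.span() for m in star.finditer(text)]
print(spans)   # [(4, 8), (8, 8)]: '3455', then an empty match right before 'u'

# With `+`, at least one digit is required, so the empty match disappears.
plus = re.compile(r'[1-9]+(?=u)')
result = plus.findall(text)
print(result)  # ['3455']
```

The span `(8, 8)` is the zero-width match described in the second comment: after `3455` is consumed, the search resumes between `'5'` and `'u'`, where the lookahead still succeeds and `[1-9]*` is satisfied by matching nothing.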
