
Can someone explain why and when I should use positive lookarounds in regex? For negative lookarounds I can think of scenarios where they are the only solution, but for positive lookarounds I don't see why I would use them when the same result can be produced with capture groups.

For example:

Input: Bus: red, Car: blue

I want the color of the car.

With lookaround: (?<=Car: )\w+
With capture group: Car: (\w+)

Both regexes achieve the same result: direct access to the color match. So are there cases which can only be solved by positive lookarounds?
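To make the comparison concrete, here is a minimal sketch in Python's `re` module (chosen only for illustration; the question does not name a language). It shows that both patterns yield the color, and that the only difference is whether it arrives as the whole match or as group 1.

```python
import re

text = "Bus: red, Car: blue"

# Lookbehind: "Car: " is asserted but not consumed,
# so the whole match (group 0) is just the color.
lookbehind_color = re.search(r"(?<=Car: )\w+", text).group(0)

# Capture group: the whole match is "Car: blue";
# the color has to be read from group 1.
capture_match = re.search(r"Car: (\w+)", text)
capture_color = capture_match.group(1)

print(lookbehind_color)        # blue
print(capture_color)           # blue
print(capture_match.group(0))  # Car: blue
```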

netblognet

2 Answers


Positive lookarounds may be useful when imposing additional conditions on a pattern without affecting the main pattern.

For instance, you may need to check whether a string contains a red bus or a blue car and is no longer than 20 characters:

^(?=.{0,20}$).*(?:Bus: red|Car: blue)

Demo: https://regex101.com/r/iQ4uL4/1
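As a rough sketch of how this pattern behaves, here it is in Python's `re` module (my choice of language for illustration; `has_short_match` is a hypothetical helper name). The lookahead checks the length of the whole string from position 0 without consuming anything, and then the main pattern scans for the bus or the car.

```python
import re

# The lookahead (?=.{0,20}$) succeeds only if the whole string
# is at most 20 characters long; it consumes nothing, so the
# main pattern still starts matching at position 0.
pattern = re.compile(r"^(?=.{0,20}$).*(?:Bus: red|Car: blue)")

def has_short_match(s):
    """Hypothetical helper: True if s passes both conditions."""
    return pattern.search(s) is not None

print(has_short_match("Bus: red, Car: blue"))                  # True
print(has_short_match("There is a Car: blue parked outside"))  # False
print(has_short_match("Car: green"))                           # False
```

The second string contains `Car: blue` but is longer than 20 characters; the third is short enough but contains neither alternative.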

More examples of this sort can be found here: https://stackoverflow.com/a/21456918/4295017

Dmitry Egorov
  • Just one more question. Why do you write the lookahead in front of the main regex pattern? I thought lookaheads, like lookbehinds, had to be written after (to the right of) the pattern they refer to? – netblognet Aug 09 '16 at 19:07
  • @netblognet, actually, it's after the `^` pattern, so this `^` is the pattern the lookahead is tied to. – Dmitry Egorov Aug 10 '16 at 03:37

PCRE is not used only in PHP. The library is used in a variety of tools and languages, where you do not always have easy access to captured groups.

In some of them, a lookbehind is the easiest way to, say, split a string (with strsplit in R), or work around the problems with accessing submatches.
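The answer mentions R's strsplit; as a loose analogue, here is the same idea sketched with Python's `re.split` (an assumption on my part: this relies on Python 3.7 or later, where splitting on an empty match is supported). Because the lookbehind matches the empty position after each `;`, the delimiter stays attached to the preceding piece instead of being thrown away.

```python
import re

text = "a;b;c"

# Split at the empty position right after each ";".
# The ";" itself is not consumed, so it is kept in the output.
parts_keep = re.split(r"(?<=;)", text)

# For comparison: splitting on ";" itself discards the delimiter.
parts_drop = re.split(r";", text)

print(parts_keep)  # ['a;', 'b;', 'c']
print(parts_drop)  # ['a', 'b', 'c']
```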

PCRE lookbehind is "crippled" in that it must be fixed-width, so it is not really full-fledged. However, here is an interesting case: a positive lookbehind placed after the main pattern can improve performance: \d{3}(?<=USD\d{3}). Here, the lookbehind check only runs once 3 digits have been matched, so the engine does not need to test for U, then S, then D, then digits at every position.
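A small sketch of the trailing-lookbehind form, again in Python's `re` (which also requires fixed-width lookbehinds; the sample string is made up). It only shows that the trailing and the leading lookbehind return the same matches; it does not measure the performance difference described above.

```python
import re

text = "USD123 EUR456 USD789"  # made-up sample input

# Lookbehind placed after the digits: match 3 digits first,
# then look back 6 characters to confirm they were "USD" + 3 digits.
trailing = re.findall(r"\d{3}(?<=USD\d{3})", text)

# The more common form: check for "USD" before matching the digits.
leading = re.findall(r"(?<=USD)\d{3}", text)

print(trailing)  # ['123', '789']
print(leading)   # ['123', '789']
```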

As for a positive lookahead, it is used in a lot of scenarios:

  • Set conditions on the string matched (see Dmitry's answer, also e.g. ^(?=.*\d) will require at least 1 digit in the string)
  • Overlapping matches are possible (e.g. -\d+(?=-|$) will find 3 matches in -1-2-3, because the lookahead does not consume the - that starts the next match)
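Both bullet points can be checked with a short sketch in Python's `re` (used here for illustration only). The last line uses a consuming group instead of the lookahead, to show what is lost when the trailing `-` is eaten by the match.

```python
import re

# Condition on the string: at least one digit somewhere.
has_digit = re.compile(r"^(?=.*\d)")
with_digit = has_digit.search("abc1") is not None
without_digit = has_digit.search("abc") is not None

# Lookahead leaves the trailing "-" unconsumed, so it can
# start the next match.
with_lookahead = re.findall(r"-\d+(?=-|$)", "-1-2-3")

# Consuming the trailing "-" instead: the "-" before 2 is used up
# by the first match, so "-2" is never found.
with_consuming = re.findall(r"-\d+(?:-|$)", "-1-2-3")

print(with_digit, without_digit)  # True False
print(with_lookahead)             # ['-1', '-2', '-3']
print(with_consuming)             # ['-1-', '-3']
```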
Wiktor Stribiżew