Questions tagged [positive-lookahead]

Positive Lookahead is a zero-length assertion used in Regular Expressions (RegEx). It looks forward in the string to see whether a match follows, but it doesn't consume the match. Use this tag for questions about RegEx where the main focus is on Positive Lookahead. The programming language used to execute the RegEx should be mentioned as well.

A regular expression checks whether a string matches a pattern. A positive lookahead is a sub-pattern that looks forward from the current position in the string to see whether a match follows, but it doesn't consume that match, which is why it is regarded as a zero-length assertion. For example, to check whether the character 'a' is immediately followed by the character 'b', the pattern would be

a(?=b)

This pattern would match:

  • about
  • absolute
  • fabulous

but wouldn't match

  • anything
  • band

The lookahead can contain any number of characters or an arbitrary regex pattern. Expanding on the previous example:

a(?=bo)

This pattern would match:

  • about

but wouldn't match

  • absolute
  • fabulous
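As a rough illustration of both cases, the sketch below (again assuming Python's `re` module) runs the literal lookahead from the example next to a variation whose lookahead contains a character class. The second pattern, `a(?=b[ou])`, is a hypothetical addition for demonstration and does not come from the text above.

```python
import re

words = ["about", "absolute", "fabulous"]

# Lookahead with a literal two-character string, as in the example above.
literal = re.compile(r"a(?=bo)")

# Hypothetical variation: the lookahead holds a small pattern,
# requiring 'b' followed by either 'o' or 'u'.
char_class = re.compile(r"a(?=b[ou])")

literal_hits = [w for w in words if literal.search(w)]
class_hits = [w for w in words if char_class.search(w)]

print(literal_hits)
print(class_hits)
```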

Since a positive lookahead doesn't consume what it matches, place capturing parentheses around the lookahead pattern to store that text, like so:

a(?=(bo))

The text matched by the lookahead is then stored in capture group 1 for retrieval.
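A minimal sketch of retrieving that stored text, assuming Python's `re` module (other engines expose capture groups through their own APIs). It also shows that the overall match still ends right after the 'a', so the captured characters remain available to whatever is matched next.

```python
import re

text = "about"
m = re.search(r"a(?=(bo))", text)

whole = m.group(0)      # the overall match: only the 'a'
captured = m.group(1)   # what the lookahead saw, kept by the capturing group
group_span = m.span(1)  # position of the captured text in the string
rest = text[m.end():]   # everything after the overall match, still unconsumed

print(whole, captured, group_span, rest)
```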

31 questions
4
votes
2 answers

Splitting sentences on space that follows a non-fixed length expression

Given the following text: text = "Van der Weyden was preoccupied by commissioned portraiture towards the end of his life[1] and was highly regarded by later generations of painters for his penetrating evocations of character. In this work, the…
Aiha
  • 41
  • 7
3
votes
1 answer

How to rewrite regexp without positive look-ahead?

The regex [^a-z0-9%*][a-z0-9%]{3,}(?=[^a-z0-9%*]) is not supported by Rust's default regex create due to positive look-ahead (?=): ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ regex parse error: …
4ntoine
  • 19,816
  • 21
  • 96
  • 220
3
votes
2 answers

Java regex positive look-ahead but match unique characters only?

I'm trying to match a String input with the criteria below: The first characters are unique lowercase English letters The next characters are the represent the current year from 1500 to 2020 The next characters can only be 10, or 100, or 1000 The…
ennth
  • 1,698
  • 5
  • 31
  • 63
3
votes
2 answers

REGEX: Select KeyWord1 if KeyWord2 is in the same string

I am trying to capture KEYWORD1 in .NET regex engine based on whether KeyWord2 is present in the string. So far the positive look-around solution I am using: (?=.*KeyWord2)**KEYWORD1** (\m\i) RegEx Test Link only captures KEYWORD1 if KeyWord2 is…
BWEL
  • 41
  • 4
3
votes
1 answer

Regex lookahead logical 'OR' - to exclude certain patterns

I've seen a lot of examples of password validation that do a logical AND. For example, password must have AT LEAST one digit (AND) AT LEAST one character (AND) length between 6 and 15 This can be written with regex 'positive…
joedotnot
  • 4,810
  • 8
  • 59
  • 91
2
votes
2 answers

Regex includes Lookahead strings in selection

I'm trying to extract the degree (Mild/Moderate/Severe) of an specific type heart dysfunction (diastolic dysfunction) from a huge number of echo reports. Here is the link to the sample excel file with 2 of those echo reports. The lines are usually…
2
votes
2 answers

extracting word before character

I am trying to extract any word before Y which is boundary separated. As I am trying to consider each line as a separate record using (?m) flag and trying to capture \w+ which is look ahead by \s+Y ,but I am only able to print 1st match, not the 2nd…
monk
  • 1,953
  • 3
  • 21
  • 41
2
votes
1 answer

Positive lookahead doesn't match Arabic text

Regex doesn't match Arabic text when using lookahead assertion I am trying to split the text: شكرا لك على المشاركة في هذه الدراسة. هذا الاستبيان يطلب معلومات عن: stored in $sentences = "شكرا لك على المشاركة في هذه الدراسة. هذا الاستبيان يطلب…
msoutopico
  • 357
  • 3
  • 15
1
vote
2 answers

Understanding regex lookaround to get desired result

I am trying to isolate street address fields that begin with a digit, contain an underscore and end with a comma: 001 ALLAN Witham Ross 13 Every_Street, Welltown Greenkeeper 002 ALLARDYCE Margaret Isabel 49 Bell_Road, Musicville Housewife 003…
Dave
  • 687
  • 7
  • 15
1
vote
2 answers

Convert regex positive look ahead to sed operation

I would like to sed to find and replace every occurrence of - with _ but only before the first occurrence of = on every line. Here is a dataset to work…
Dave
  • 727
  • 1
  • 9
  • 20
1
vote
1 answer

postgres regex positive lookahead is not working as expected

I want to capture tokens in a text in the following pattern: The First 2 characters are alphabets and necessary, ends with [A-Z] or [A-Z][0-9] this is optional anything can come in between. example: AA123123A1 AA123123A AA123123123 i want to match…
1
vote
1 answer

How to split look-ahead regex into 2 plain regexes?

I have a look-ahead regex [^a-z0-9%*][a-z0-9%]{3,}(?=[^a-z0-9%*]). In my test it extracts 4 substrings from @@||imasdk.googleapis.com/js/core/bridge*.html: |imasdk .googleapis .com /core I need to rewrite it with 2 good-old regexes as i can't use…
4ntoine
  • 19,816
  • 21
  • 96
  • 220
1
vote
1 answer

Pull last 3 characters from Regex positive lookahead match

I have the following regex expression ^.*?(?=\.) and I'm looking to modify it so that I can pull the 3 character country code (bold) from a given path. Right now the regex just creates a group before the period.…
sfasu77
  • 35
  • 5
1
vote
1 answer

Syntax for Lookahead and Lookbehind in Grok Custom Pattern

I'm trying to use a lookbehind and a lookahead in a Grok custom pattern and getting pattern match errors in the Grok debugger that I cannot resolve. This is for archiving system logs. I am currently trying to parse the postgrey application. …
Toby
  • 245
  • 1
  • 10
1
vote
2 answers

How to use positive lookbehind assertions to extract substring from string following the word "named"

I have a pandas series of text from tweets. The tweets are about dogs. Some of the tweets contain the dog's name. The name shows up in the following way. "...blah blah blah named name. blah blah blah..." Unknown number of characters before and after…
a2fet
  • 37
  • 1
  • 6