
I need help with a regex that should match a fixed length pattern.

For example, the following regex allows for at most 1 ( and 1 ) in the matched pattern:

([^)(]*\(?[^)(]*\)?[^)(]*)

However, I cannot (and do not want to) use this solution because of the *: the text I have to scan is very large, and the unbounded quantifier seems to hurt performance noticeably.

I thus want to impose a match length limit, e.g. using {10,100}.

In other words, the regex should only match if

  • there is at most one set of parentheses inside the string
  • the total length of the match is bounded, i.e. not infinite (no *!)

This seems to be a solution to my problem; however, I cannot get it to work and I have trouble understanding it. I tried to adapt the accepted answer and came up with this:

^(?=[^()]{5,10}$)[^()]*(?:[()][^()]*){0,2}$

which does not seem to really work as expected: https://regex101.com/r/XUiJZz/1

I am unable to make use of the Kleene star operator.


Edit

I know this is a possible solution, but I'm wondering if there is a better way to do it:

([^)(]{0,100}\(?[^)(]{0,100}\)?[^)(]{0,100})
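To see what this bounded variant actually caps, here is a minimal Python sketch. The names `PATTERN`, `MAX_LEN` and `first_match` and the sample strings are illustrative assumptions, not part of the larger regex. The point is only that no single match can exceed 100 + 1 + 100 + 1 + 100 = 302 characters.

```python
import re

# Bounded variant: each run of non-parenthesis characters is capped at 100.
PATTERN = re.compile(r'([^)(]{0,100}\(?[^)(]{0,100}\)?[^)(]{0,100})')

# Longest possible match: three runs of 100 plus one "(" and one ")".
MAX_LEN = 3 * 100 + 2


def first_match(text: str) -> str:
    """Return the first match found while scanning `text` (may be empty)."""
    m = PATTERN.search(text)
    return m.group(1) if m else ""


if __name__ == "__main__":
    sample = "see (Smith 2019) and (Jones 2020)"
    # The match stops before the second "(" because only one is allowed.
    print(repr(first_match(sample)))
    # A long run without parentheses is cut off by the bounded quantifiers.
    print(len(first_match("a" * 1000)), "<=", MAX_LEN)
```

Because every part of the pattern is optional, it can also match the empty string, which is worth keeping in mind when it is embedded in a larger regex.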

halfer
charelf
  • What are your rules for matching? – anubhava Jun 18 '19 at 10:12
  • What do you mean by rules? The regex is part of a bigger regex – charelf Jun 18 '19 at 10:13
  • It is a bit too complex to explain in a comment. Basically, I'm trying to extract inline citations from a scientific text, and part of the string should not contain more than one set of parentheses, to keep the regex from becoming too greedy, as parentheses are one of the only delimiters I can use – charelf Jun 18 '19 at 10:25
  • It is always the same: you write a regex that matches your desired format. Then, you add `(?=.{your_min, your_max_threshold}$)` after `^`. – Wiktor Stribiżew Jun 18 '19 at 10:25
  • I read through your answer. There is one problem left, namely I cannot really use `^` or `$`. Assuming I can use a simple `x` as delimiter, would it still work when I replace `^` and `$` by `x` and `x`? – charelf Jun 18 '19 at 10:31

1 Answer


I thus want to impose a match length limit, e.g. using {10,100}

You may want to add anchors and a lookahead assertion to your regex:

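^(?=.{10,100}$)[^)(]*(?:\(?[^)(]*\))?[^)(]*$

(?=.{10,100}$) is a lookahead that asserts the length of the string is between 10 and 100 characters. The $ inside the lookahead is what enforces the upper bound; without it, only the minimum of 10 would be checked.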
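As a quick way to check the length assertion, here is a minimal Python sketch. The names `BOUNDED`, `MIN_ONLY` and `is_bounded_match` and the test strings are assumptions made for illustration. Note that the upper bound is only enforced when the lookahead itself ends in `$`, which the second pattern is there to demonstrate.

```python
import re

# Lookahead ends in `$`, so both the lower and the upper length bound apply.
BOUNDED = re.compile(r'^(?=.{10,100}$)[^)(]*(?:\(?[^)(]*\))?[^)(]*$')

# Same pattern without `$` in the lookahead: only the minimum of 10 is enforced.
MIN_ONLY = re.compile(r'^(?=.{10,100})[^)(]*(?:\(?[^)(]*\))?[^)(]*$')


def is_bounded_match(s: str) -> bool:
    """True if `s` has 10..100 characters and at most one set of parentheses."""
    return BOUNDED.match(s) is not None


if __name__ == "__main__":
    for candidate in [
        "see Smith (2019) here",  # one set of parentheses, 21 characters
        "short (a)",              # only 9 characters
        "a (b) c (d) e f g",      # two sets of parentheses
        "a" * 101,                # one character too long
    ]:
        print(len(candidate), is_bounded_match(candidate))
    # Without `$` in the lookahead, a 150-character string still matches.
    print(MIN_ONLY.match("a" * 150) is not None)
```

The lookahead costs one extra pass over at most 100 characters per attempt, after which the main pattern can keep its `*` quantifiers, since the total length has already been bounded.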


anubhava