
For example, this is the regular expression

([a]{2,3})

These are the test strings

aaaa // 1 match "(aaa)a" but I want "(aa)(aa)"
aaaaa // 2 match "(aaa)(aa)"
aaaaaa // 2 match "(aaa)(aaa)"

However, if I change the regular expression to

([a]{2,3}?)

Then the results are

aaaa // 2 match "(aa)(aa)"
aaaaa // 2 match "(aa)(aa)a" but I want "(aaa)(aa)"
aaaaaa // 3 match "(aa)(aa)(aa)" but I want "(aaa)(aaa)"
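For reference, here is a minimal sketch that reproduces both sets of results. It assumes Python's `re` module (the question does not name a regex engine), and `groups` is just an illustrative helper name.

```python
import re

# The two patterns from the question: greedy and lazy quantifier.
greedy = re.compile(r"([a]{2,3})")
lazy = re.compile(r"([a]{2,3}?)")

def groups(pattern, text):
    """Return the captured groups found left to right in text."""
    # With exactly one capturing group, findall returns that group's text.
    return pattern.findall(text)

for text in ("aaaa", "aaaaa", "aaaaaa"):
    print(text, "greedy:", groups(greedy, text), "lazy:", groups(lazy, text))
```

The greedy version always takes three characters when it can, and the lazy version always takes two, so neither adapts to what is left over.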

My question is: is it possible to match as long a string as possible using as few groups as possible?

tripleee
Joshua
  • Your examples do not generalize to "use as few groups as possible". The smallest possible number of groups is zero, trivially; the answer is then simply `a*`. – tripleee Jul 28 '18 at 09:12

3 Answers

1

How about something like this:

(a{3}(?!a(?:[^a]|$))|a{2})

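This looks for either the character a three times (as long as those three are not followed by exactly one more a, that is, an a which is itself followed by a different character or the end of the string) or the character a two times. The lookahead prevents a three-character match from leaving a single a behind.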

Breakdown:

(                   # Start of the capturing group.
    a{3}            # Matches the character 'a' exactly three times.
    (?!             # Start of a negative Lookahead.
        a           # Matches the character 'a' literally.
        (?:         # Start of the non-capturing group.
            [^a]    # Matches any character except for 'a'.
            |       # Alternation (OR).
            $       # Asserts position at the end of the line/string.
        )           # End of the non-capturing group.
    )               # End of the negative Lookahead.
    |               # Alternation (OR).
    a{2}            # Matches the character 'a' exactly two times.
)                   # End of the capturing group.

Here's a demo.

Note that if you don't need the capturing group, you can actually use the whole match instead by converting the capturing group into a non-capturing one:

(?:a{3}(?!a(?:[^a]|$))|a{2})

Which would look like this.
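A quick way to check both variants against the strings from the question is the sketch below. It assumes Python's `re` module, and `split_runs` is a hypothetical helper name, not part of the answer.

```python
import re

# Capturing and non-capturing variants of the pattern above.
capturing = re.compile(r"(a{3}(?!a(?:[^a]|$))|a{2})")
non_capturing = re.compile(r"(?:a{3}(?!a(?:[^a]|$))|a{2})")

def split_runs(pattern, text):
    """Return the whole-match text of every match, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]

for text in ("aaaa", "aaaaa", "aaaaaa"):
    # Both variants should produce the same chunks; only the grouping differs.
    print(text, split_runs(capturing, text), split_runs(non_capturing, text))
```

Using `group(0)` keeps the helper independent of whether the pattern has a capturing group.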

1

Try this Regex:

^(?:(a{3})*|(a{2,3})*)$

Click for Demo

Explanation:

  • ^ - asserts the start of the line
  • (?:(a{3})*|(a{2,3})*) - a non-capturing group containing 2 sub-sequences separated by OR operator
    • (a{3})* - the first alternative matches a exactly 3 times. The * at the end repeats it, so it can match 0, 3, 6, 9, ... occurrences of a before the end of the line
    • | - OR
    • (a{2,3})* - the second alternative matches 2 to 3 occurrences of a, as many as possible. The * at the end repeats it 0 or more times before the end of the line
  • $ - asserts the end of the line
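As a rough illustration, the anchored pattern can be tried out as below. This assumes Python's `re` module, and `is_valid` is a hypothetical helper name. Note that in Python a repeated capturing group only keeps the text of its last iteration, so this pattern checks whether the whole line can be split into chunks rather than returning every chunk.

```python
import re

# The anchored pattern from this answer.
pattern = re.compile(r"^(?:(a{3})*|(a{2,3})*)$")

def is_valid(text):
    """Return True if the whole text can be split into runs of 2 or 3 a's."""
    return pattern.match(text) is not None

for text in ("a", "aaaa", "aaaaa", "aaaaaa"):
    m = pattern.match(text)
    # groups() shows only the last iteration captured by each repeated group.
    print(text, is_valid(text), m.groups() if m else None)
```

To get the individual chunks, a pattern used with repeated matching (as in the other answers) is probably easier to work with.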

Gurmanjot Singh
1

Try this short regex:

a{2,3}(?!a([^a]|$))

Demo

How it's made:

I started with this simple regex: a{2}a?. It looks for 2 consecutive a's that may be followed by another a. If the 2 a's are followed by another a, it matches all three a's.

This worked for most cases.

However, it failed in cases like aaaa, where it matches the first three a's and leaves the last a unmatched.

So I knew I had to modify the regex so that it matches the third a only if that a is not followed by a([^a]|$), i.e. by exactly one more a. The regex then became a{2}a?(?!a([^a]|$)), and it worked for all cases. Then I just simplified it to a{2,3}(?!a([^a]|$)).

That's it.
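The two stages described above can be compared with a short sketch. It assumes Python's `re` module, and `matches` is a hypothetical helper name.

```python
import re

# Starting point and final version from the steps above.
simple = re.compile(r"a{2}a?")
final = re.compile(r"a{2,3}(?!a([^a]|$))")

def matches(pattern, text):
    """Return the full text of each match, left to right."""
    # group(0) is used because the final pattern has a capturing group
    # inside the lookahead, which would change what findall returns.
    return [m.group(0) for m in pattern.finditer(text)]

for text in ("aaaa", "aaaaa", "aaaaaa"):
    print(text, "simple:", matches(simple, text), "final:", matches(final, text))
```

On aaaa the simple version consumes three a's and strands the fourth, while the final version backs off to two and two.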


EDIT

If you want the capturing behavior, then add parentheses around the regex, like:

(a{2,3}(?!a([^a]|$)))

Wololo
  • Rather than simply explaining how a regex is working, I prefer to explain how it's made (my thought process, kind-of). I guess, it's easier to understand how the regex is working once you know how it was made. – Wololo Jul 28 '18 at 08:06
  • This does not seem to work for all characters. For example, if I replace `a` with `=`, it does not work on `====`. – Joshua Jul 28 '18 at 08:30
  • The accepted answer also matches `a`'s only. That's why I guessed that you needed to match `a`'s only. – Wololo Jul 28 '18 at 08:50
  • No, the accepted answer actually accepts other characters. – Joshua Jul 28 '18 at 08:52
  • Woops, my bad. I'll modify it if I can, otherwise, I will delete the answer. Thanks – Wololo Jul 28 '18 at 08:55
  • @Joshua Try this a{2,3}(?!a([^a]|$)) – Wololo Jul 28 '18 at 09:56
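
One possible way to apply the same pattern to a different character such as `=` is sketched below. This is an assumption on my part rather than something taken from the answer: it uses Python's `re` module, and `build_pattern` and `matches` are hypothetical helper names.

```python
import re

def build_pattern(ch):
    """Build the 2-or-3 run pattern for a single character ch (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("ch must be a single character")
    c = re.escape(ch)  # escape characters that are special in a regex
    # Same shape as a{2,3}(?!a([^a]|$)), with the character substituted in
    # and a non-capturing group inside the lookahead.
    return re.compile(rf"{c}{{2,3}}(?!{c}(?:[^{c}]|$))")

def matches(pattern, text):
    """Return the full text of each match, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]

for text in ("====", "=====", "======"):
    print(text, matches(build_pattern("="), text))
```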