
I have this regular expression:

[a+][a-z-[a]]{1}[a+]

It matches the string "aadaa",

but it also matches the string "aaaaaaaadaa".

Is there any way to force it to match only strings in which the number of a's on the left side equals the number of a's on the right side?

That is, it should match only "aadaa" and not "aaaaaaaadaa".

Edit

With the help of Peter's answer I got it working. This is the version that works for my requirement:

(a+)[a-z-[a]]{1}\1
Pawan Nogariya

2 Answers


You can use a back reference, as follows:

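console.log(check("ada"));          // true
console.log(check("aadaa"));        // true
console.log(check("aaaaaaaadaa"));  // false
console.log(check("aaadaaaaaaa"));  // false

function check(str) {
  // (.*) captures any prefix, . matches one middle character,
  // and \1 requires the exact same text as the captured prefix.
  var re = /^(.*).\1$/;
  return re.test(str);
}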

Or to only match a's and d's:

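console.log(check("aca"));          // false (the middle character must be d)
console.log(check("aadaa"));        // true
console.log(check("aaaaaaaadaa"));  // false
console.log(check("aaadaaaaaaa"));  // false

function check(str) {
  // (a*) captures the leading run of a's; \1 requires the same run after the d.
  var re = /^(a*)d\1$/;
  return re.test(str);
}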

Or to match only a's surrounding a single lowercase letter that is not an a:

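console.log(check("aca"));          // true
console.log(check("aadaa"));        // true
console.log(check("aaaaaaaadaa"));  // false
console.log(check("aaadaaaaaaa"));  // false

function check(str) {
  // [b-z] accepts any single lowercase letter except a as the middle character.
  var re = /^(a*)[b-z]\1$/;
  return re.test(str);
}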
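The `^` and `$` anchors in these patterns matter for the original question. The following is a minimal sketch, not part of the original demo, that compares the last pattern with and without anchors; `firstMatch` is a hypothetical helper introduced only for illustration.

```javascript
// Same pattern as in the snippet above, with and without the ^ and $ anchors.
const withAnchors = /^(a*)[b-z]\1$/;
const withoutAnchors = /(a*)[b-z]\1/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function firstMatch(re, str) {
  const m = str.match(re);
  return m ? m[0] : null;
}

console.log(firstMatch(withAnchors, "aaaaaaaadaa"));    // null
console.log(firstMatch(withoutAnchors, "aaaaaaaadaa")); // aadaa (found inside the longer string)
console.log(firstMatch(withAnchors, "aadaa"));          // aadaa
```

Without the anchors the back reference still forces equal counts, but the engine is free to find a balanced substring such as "aadaa" inside an unbalanced string.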

I realize all of the above is JavaScript, which was easy for quick demoing within the context of Stack Overflow.

I made a working DotNetFiddle with the following C# code, which is similar to the snippets above (it uses a+ instead of a*, so at least one a is required on each side):

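using System;
using System.Text.RegularExpressions;

public class Program
{
    // One or more a's, one letter from b to z, then the same run of a's again.
    public static Regex re = new Regex(@"^(a+)[b-z]\1$");

    public static void Main()
    {
        check("aca");
        check("ada");
        check("aadaa");
        check("aaddaa");
        check("aadcaa");
        check("aaaaaaaadaa");
        check("aadaaaaaaaa");
    }

    // Prints the input string and whether the whole string matches.
    public static void check(string str)
    {
        Console.WriteLine(str + " -> " + re.IsMatch(str));
    }
}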
Peter B
  • Thanks @peter. It seems working but not exactly how I want it to. I have finally updated it to this `([a+])[a-z-[a]]{1}\1` and my string is `bcaabaaaaeg` but it is matching `aba` while it should match `aabaa`, no? Could you please tell me what is the mistake I am doing – Pawan Nogariya Mar 18 '19 at 13:35
  • But could you please point out what was the mistake with this `([a]+)[a-z-[a]]{1}\1` – Pawan Nogariya Mar 18 '19 at 13:39
  • `([a]+)` and `(a+)` should match exactly the same, the first has square brackets that can be left out because they contain only one character to match. Maybe you made a typo while trying it. – Peter B Mar 18 '19 at 13:53
  • @PawanNogariya You missed both `^` and `$` anchors. – revo Mar 18 '19 at 14:05
  • @PeterB - Yeah, I was also thinking the same. Could you please also tell me how can I make it select the common groups also? I mean for this string `abcbaba` it returns `bcb` as a match while I am expecting it to return two matches `bcb` and `bab`. I know since the last `b` is common in both the matches and so it is matching only that once but is there any way to get all the matches even when they are in common? – Pawan Nogariya Mar 18 '19 at 14:34
  • It's better to post that as a new question, so this one doesn't get all messy. – Peter B Mar 18 '19 at 16:03
  • I have never needed it before with regex, but https://www.regular-expressions.info/balancing.html looks like what you're looking for – user3104267 Mar 18 '19 at 22:10
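The `aba` versus `aabaa` result discussed in the comments can be reproduced with a short JavaScript sketch. It assumes `[b-z]` as a stand-in for the .NET subtraction class `[a-z-[a]]`, and the variable names are illustrative only. The point is that `[a+]` is a character class matching a single `a` or a literal `+`, while `(a+)` captures a whole run of a's.

```javascript
const input = "bcaabaaaaeg"; // the string from the comment

// [a+] matches exactly ONE character ('a' or '+'), so the group holds a single 'a'.
const charClass = /([a+])[b-z]\1/;
// (a+) is a greedy quantifier, so the group holds the full run of a's before the middle letter.
const quantified = /(a+)[b-z]\1/;
// With anchors, the entire string must have the balanced shape.
const wholeString = /^(a+)[b-z]\1$/;

console.log(input.match(charClass)[0]);  // aba
console.log(input.match(quantified)[0]); // aabaa
console.log(wholeString.test(input));    // false
console.log(wholeString.test("aabaa"));  // true
```

Both unanchored patterns search for a substring, which is why they find a match inside `bcaabaaaaeg` at all; the anchored pattern only accepts a string that is balanced as a whole.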

You can also use the following regex for the same purpose, although I would prefer the one suggested by @PeterB:

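console.log(check("aca"));          // true
console.log(check("aadaa"));        // true
console.log(check("aaaaaaaadaa"));  // false
console.log(check("aaadaaaaaaa"));  // false

function check(str) {
  // (\w+) captures any run of word characters, [A-Za-z] matches one letter,
  // and \1 requires the captured run to repeat.
  var re = /^(\w+)[A-Za-z]\1$/;
  return re.test(str);
}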

The code is the same as in Peter B's answer; only the regex has been changed.
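One difference is worth checking before choosing between the two: `\w+` and `[A-Za-z]` accept more than a's around a non-a letter. The sketch below is an illustration under that reading, with hypothetical names `loose` and `strict`, comparing this regex with the `^(a+)[b-z]\1$` pattern from Peter B's C# example.

```javascript
const loose = /^(\w+)[A-Za-z]\1$/; // regex from this answer
const strict = /^(a+)[b-z]\1$/;    // regex from Peter B's C# example

for (const s of ["aadaa", "aaaaa", "12x12", "aaaaaaaadaa"]) {
  console.log(s, loose.test(s), strict.test(s));
}
// aadaa true true
// aaaaa true false   (the middle character may itself be an a)
// 12x12 true false   (\w also matches digits and underscore)
// aaaaaaaadaa false false
```

Both patterns reject the unbalanced string, so either satisfies the equal-count requirement; the stricter pattern additionally enforces the "a's around a non-a letter" shape from the question.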

Code_Ninja