2

For example, say I have the following sentence:

a cat bit a dog on the butt before running away

If the two characters I want to use are 'a' and 'b', then I want to match up to the point where there are equal numbers of 'a' and 'b', like the following:

a cat bit a dog on the butt b

In the above case, the full sentence has 5 a's and 3 b's. I want to match up to the point where I have 3 a's and 3 b's.

Thank you for the help.

Ark-of-Ice

2 Answers

5

It's not possible.

The language a^n b^n is not regular, so matching it with a regular expression in the formal sense is mathematically impossible, and JavaScript's enhanced regular expressions are not powerful enough to describe it either.

You can use the following function, which doesn't use a regular expression, to get the range:

var input = "a cat bit a dog on the butt b";
console.log(getRange(input, "a", "b"));

function getRange(input, char1, char2){
  var indexStart = -1;
  var count1 = 0, count2 = 0;
  
  for(var i = 0; i < input.length; i++){
    var char = input[i];
    switch(char){
      case char1:
        count1 += 1; break;
      case char2:
        count2 += 1; break;
    }
    if(char == char1 || char == char2){
      if(indexStart == -1)
        indexStart = i;

      if(count1 == count2)
        return [indexStart, i];
    }
  }
  
  return [-1, -1];
}
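
As a usage sketch, the same counting idea can return the matched text itself rather than an index pair. The helper name `balancedPrefix` is hypothetical (it is not part of the function above), and it assumes the match should start at the first occurrence of either character and stop at the first point where the two counts are equal:

```javascript
// Hypothetical helper: returns the substring from the first occurrence of
// char1 or char2 up to the first point where both have appeared equally often.
function balancedPrefix(input, char1, char2) {
  let start = -1;
  let count1 = 0;
  let count2 = 0;

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    // Characters other than the two of interest do not affect the counts.
    if (ch !== char1 && ch !== char2) continue;

    if (start === -1) start = i;
    if (ch === char1) count1++;
    else count2++;

    // Stop at the first position where the counts balance.
    if (count1 === count2) return input.slice(start, i + 1);
  }
  return null; // the counts never balance
}

console.log(balancedPrefix("a cat bit a dog on the butt before running away", "a", "b"));
// prints "a cat bit a dog on the butt b"
```

Returning `null` when the counts never balance is a design choice made for this sketch; an empty string or an index pair would work just as well.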
Derek 朕會功夫
  • In Javascript, perhaps not (not sure yet), but it can be done in some REs: https://stackoverflow.com/a/3644267/2557128 - common RE engines in languages are not limited to regular languages. – NetMage Dec 05 '17 at 01:17
  • @NetMage They are *enhanced* regular expressions which are not exactly regular expressions. JS also has enhanced regexps but it's not sophisticated enough to describe a^n b^n. – Derek 朕會功夫 Dec 05 '17 at 01:24
  • Given that `\1?+` is equivalent to `(?>\1?)` and that (some?) versions of `(?>...)` can be emulated with `(?=...)\1` I am not sure Javascript can't handle this yet. – NetMage Dec 05 '17 at 01:29
  • @NetMage I would love to see a regexp in JS that matches a^n b^n if that's possible! But even if it does, regexp should still not be used to do what OP's trying to do. – Derek 朕會功夫 Dec 05 '17 at 01:31
  • The OP's question is equivalent to matching nested brackets. I made a post recently showing this is possible using forward references, which are not supported in JavaScript, unfortunately. I too would like to see this done, but I fear it is impossible! – jaytea Dec 05 '17 at 08:35
  • @jaytea, that's exactly what I'm using it for, haha – Ark-of-Ice Dec 05 '17 at 16:27
  • @Ark-of-Ice Depending on what you're trying to do, you might want to write an actual parser. Matching nested brackets is a classic example of a language that is not regular. – Derek 朕會功夫 Dec 05 '17 at 16:34
  • @Derek朕會功夫 I'm using regex to examine js files of mine to help myself learn regular expressions better, so this is purely educational, but I'll look into it. I don't really know anything about compiling an actual parser. – Ark-of-Ice Dec 05 '17 at 16:40
  • @Ark-of-Ice My suggestion is to stop right there and discard the idea of using regular expressions to parse JS files. It's not possible and you will fail doing it. Mathematically speaking, you might need at least a [Turing machine](https://en.m.wikipedia.org/wiki/Recursively_enumerable_language) to be able to parse it correctly. – Derek 朕會功夫 Dec 05 '17 at 16:48
  • @Derek朕會功夫 I'm not using it in a professional setting. It's just an organic string to let me practice regular expressions on – Ark-of-Ice Dec 05 '17 at 16:56
1

var input = "a cat bit a dog on the butt before running away";
var a = 0,
    b = 0;
var result = 0;
var pattern = /./g;
var match;
// Step through the string one character at a time.
while ((match = pattern.exec(input)) !== null) {
    if (match[0] == "a") {
        a++;
        // Remember the end position whenever the counts are equal.
        if (a == b) {
            result = pattern.lastIndex;
        }
    }
    if (match[0] == "b") {
        b++;
        if (a == b) {
            result = pattern.lastIndex;
        }
    }
}
console.log("1~" + result);
console.log(input.slice(0, result));
By the way, I know kung fu too.