I have a string:

let string = "XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX";

and I'd like to extract every substring that sits between the strings AX and BX, so that the result is an array like this:

let result = ["12345", "9393B33AXAX"];

I've tried to use some kind of regex, but I wasn't really successful, to be honest.

let result = string.split(/AX([^AXBX]+)BX/);

Another approach was a simple for loop, but this doesn't work as I expected either. Maybe somebody can help me fix the issues. Please have a look at my code:

let string = "XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX"

let result = [];
for (let i=0; i<string.length; i++) {
  if (string[i] == "A" && string[i+1] === "X") {
    for (let j=i; j<string.length; j++) {
      if (string[j] == "B" && string[j+1] === "X") {
        let substring = string.substring(i+1, j+1);
        result.push(substring)
        break;
      }
    }
  }
}

console.log(result);
  • I would say that you are extracting all non-empty occurrences, so the string AXBX can be safely ignored – Jorge Lavín Jan 06 '19 at 20:56
  • Just like your last question except now it's `AX` and `BX` –  Jan 06 '19 at 20:58
  • `AX((?:(?![AB]X)[\S\s])+)BX` –  Jan 06 '19 at 20:59
  • Shouldn't the second string be `9393B33AXAX`? – Jorge Lavín Jan 06 '19 at 21:02
  • It's not clear why you should get `9393` in the result. It's not between `AX` and `BX`. – Mark Jan 06 '19 at 21:03
  • If you only care about the numbers in between, you can split by digits and then filter out everything else: `string.split(/(\d+)/);` – NickHTTPS Jan 06 '19 at 21:03
  • Looks like a working solution - **but is there any universal solution for building the regex from two given strings?** I'm not able to build this regex myself with other strings instead of "AX" and "BX". –  Jan 06 '19 at 21:05
  • I commented a universal solution. –  Jan 06 '19 at 21:06
  • So how do I use this with a four-character string like "abcd" and "efgh"? Isn't it `abcd((?:(?![ae]X)[\S\s])+)efgh`? –  Jan 06 '19 at 21:07
  • @MarkMeyer Yeah you are right - but in my for loop the last element ("XB") is wrong. –  Jan 06 '19 at 21:08
  • `string1((?:(?!string1|string2)[\S\s])+)string2` –  Jan 06 '19 at 21:10
  • `AX(.+?)BX` will create the result you want. – Herohtar Jan 06 '19 at 21:10
  • Stuff like `AX(.+?)BX` will match 098`AXAXAXAXSX98qqABBEZSAXAXBX`098 –  Jan 06 '19 at 21:13
  • Many thanks for your help, guys. But please note that the code below does not work with the `"string1((?:(?!string1|string2)[\S\s])+)string2"` regex: `let detectorStart = "3737", detectorEnd = "1717"; let r = \`/${detectorStart}((?:(?!${detectorStart}|${detectorEnd})[\S\s])+)${detectorEnd}/g\`; \n\nlet result = "0103737254145146162476316412434641717".match(r)`. –  Jan 06 '19 at 21:19
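
As far as I can tell, the template-literal attempt in the last comment fails for two reasons: `match()` receives a plain string that still contains the slashes and the `/g` suffix, and `\S\s` inside a template literal loses its backslashes. A minimal sketch of building the pattern from the comments with the `RegExp` constructor instead; `escapeRegExp`, `buildTemperedRegex` and `extractTempered` are hypothetical names, not part of the thread:

```javascript
// Hypothetical helper: escape regex metacharacters in a literal delimiter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds start((?:(?!start|end)[\s\S])+)end from two plain strings.
function buildTemperedRegex(start, end) {
  const s = escapeRegExp(start);
  const e = escapeRegExp(end);
  // Backslashes are doubled because this is a string, not a regex literal.
  return new RegExp(s + "((?:(?!" + s + "|" + e + ")[\\s\\S])+)" + e, "g");
}

// Collects capture group 1 of every match.
function extractTempered(input, start, end) {
  return Array.from(input.matchAll(buildTemperedRegex(start, end)), (m) => m[1]);
}

console.log(
  extractTempered("0103737254145146162476316412434641717", "3737", "1717"));
// logs [ '25414514616247631641243464' ]
```

Note that this pattern is stricter than a lazy `AX(.*?)BX`: it rejects any candidate that contains another start or end delimiter, so for the AX/BX sample string it yields only `["12345"]`.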

2 Answers


Here's a simple solution:

function re_esc(str) {
    return str.replace(/\W/g, "\\$&");
}

const start = "AX";
const end = "BX";

const re = new RegExp(re_esc(start) + '([\\s\\S]*?)' + re_esc(end), 'g');

const string = "XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX";
const results = [];

let m;
while (m = re.exec(string)) {
    results.push(m[1]);
}

console.log(results);

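We build a regex of the form START([\s\S]*?)END, with both delimiters escaped, then call exec on it in a loop to extract the matches one after another. The lazy quantifier *? makes each match stop at the first END that follows its START.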
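
If `String.prototype.matchAll` is available (ES2020 and later), the exec loop can probably be collapsed into a single expression. A sketch under that assumption; `escapeNonWord` mirrors `re_esc` above, and `extractBetween` is a hypothetical wrapper name:

```javascript
// Same escaping rule as re_esc: backslash-escape every non-word character.
function escapeNonWord(str) {
  return str.replace(/\W/g, "\\$&");
}

// Hypothetical wrapper: returns the text between each start/end pair.
function extractBetween(input, start, end) {
  // matchAll requires the "g" flag and throws a TypeError without it.
  const re = new RegExp(
    escapeNonWord(start) + "([\\s\\S]*?)" + escapeNonWord(end), "g");
  // Each match is an array; index 1 holds the captured group.
  return Array.from(input.matchAll(re), (m) => m[1]);
}

console.log(
  extractBetween("XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX", "AX", "BX"));
// logs [ '12345', '9393B33AXAX' ]
```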

melpomene
  • Side note: This would be much less annoying in the original Perl: `my @results; while ($string =~ m{ \Q$start\E (.*?) \Q$end\E }xsg) { push @results, $1; }` – melpomene Jan 06 '19 at 21:43

Here's a relatively straightforward looping approach that doesn't use regexes:

function findOccurrences(str, fromStr, toStr) {
  const occurrences = [];
  let startIndex = 0;
  
  while (true) {
    const fromIndex = str.indexOf(fromStr, startIndex);
    if (fromIndex === -1) {
      break;
    }

    const toIndex = str.indexOf(toStr, fromIndex + fromStr.length);
    if (toIndex === -1) {
      break;
    }

    const occurrence = str.slice(fromIndex + fromStr.length, toIndex);
    occurrences.push(occurrence);
    startIndex = toIndex + toStr.length;
  }

  return occurrences;
}

console.log(
  findOccurrences("XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX",
    "AX", "BX"));

This doesn't include any sanity checks; for instance, you might want to check that fromStr and toStr aren't empty strings.
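
One way such checks might look, as a sketch rather than a definitive version: if both delimiters are empty, `indexOf` keeps returning the same position and the loop above never advances, so rejecting empty delimiters up front seems the simplest guard. The name `findOccurrencesChecked` and the choice of error types are assumptions:

```javascript
// Hypothetical guarded variant of findOccurrences.
function findOccurrencesChecked(str, fromStr, toStr) {
  if ([str, fromStr, toStr].some((arg) => typeof arg !== "string")) {
    throw new TypeError("str, fromStr and toStr must all be strings");
  }
  if (fromStr === "" || toStr === "") {
    throw new RangeError("fromStr and toStr must not be empty");
  }

  const occurrences = [];
  let startIndex = 0;

  while (true) {
    const fromIndex = str.indexOf(fromStr, startIndex);
    if (fromIndex === -1) break;

    // Search for the end delimiter only after the start delimiter.
    const contentStart = fromIndex + fromStr.length;
    const toIndex = str.indexOf(toStr, contentStart);
    if (toIndex === -1) break;

    occurrences.push(str.slice(contentStart, toIndex));
    // Resume scanning right after the end delimiter.
    startIndex = toIndex + toStr.length;
  }

  return occurrences;
}

console.log(
  findOccurrencesChecked("XABXAX12345BX293993AX9393B33AXAXBXBXBXAAABBX",
    "AX", "BX"));
// logs [ '12345', '9393B33AXAX' ]
```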

Kevin Ji