
I am looking for a JavaScript regex, without the lookahead/lookbehind features added in ECMAScript 2018, that lets me select text between two asterisks but ignores escaped characters.

In the following example, only the text "test" and any escaped characters it contains should be selected according to the rules above: \*jdjdjdfdf*test*dfsdf\*adfasdasdasd*test**test\**sd* (Selected: "test", "test", "test\*")

During my research I found this solution, "Regex, everything between two characters except escaped characters": /(?<!\\)(%.*?(?<!\\)%)/. It uses negative lookbehinds, which are supported from ECMAScript 2018 on, but I also need to support IE11, so this solution doesn't work for me.

Then I found another approach that almost gets me there: "Javascript: negative lookbehind equivalent?". I adapted Kamil Szot's answer to fit my needs: ((?!([\\])).|^)(\*.*?((?!([\\])).|^)\*). Unfortunately, it doesn't work when two asterisks (**) are in a row.

I have already invested a lot of hours and can't seem to get it right. Any help is appreciated!

An example of what I have so far is here: https://www.regexpal.com/?fam=117350

I need to use the regexp in a string.replace call (str.replace(regexp|substr, newSubStr|function)) so that I can wrap the found strings in a span element with a specific class.

marvimarvv

2 Answers


You can use this regular expression:

(?:\\.|[^*])*\*((?:\\.|[^*])*)\*

Your code should then use only the single capture group of each match.

Like this:

var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /(?:\\.|[^*])*\*((?:\\.|[^*])*)\*/g;

var match;
while (match = regex.exec(str)) {
    console.log(match[1]);
}

If you need to replace the matches, for instance to wrap the matches in a span tag while also dropping the asterisks, then use two capture groups:

var str = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
var regex = /((?:\\.|[^*])*)\*((?:\\.|[^*])*)\*/g;

var result = str.replace(regex, "$1<span>$2</span>");
console.log(result);
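
Since the question asks for a span with a specific class, a replacer function may be more convenient than a `$1`/`$2` template. The sketch below is not the regex from this answer. It is an alternative that consumes each escaped pair as its own match, so an escaped asterisk is not picked up as a delimiter even when the input contains an unpaired asterisk (for the in-memory text a\*b*c, the two-group regex above backtracks and wraps "b"). The names `wrapStarred`, `sampleText` and the class name `highlight` are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: wraps unescaped *...* runs in a span, leaving escapes alone.
function wrapStarred(text, className) {
    // Alternative 1 consumes a backslash plus the character it escapes.
    // Alternative 2 matches an unescaped *...* pair and captures its content.
    var tokenRegex = /\\[\s\S]|\*((?:\\[\s\S]|[^*\\])*)\*/g;
    return text.replace(tokenRegex, function (match, inner) {
        if (match.charAt(0) === "\\") {
            return match; // escaped pair outside any *...* run: keep it unchanged
        }
        return '<span class="' + className + '">' + inner + "</span>";
    });
}

var sampleText = "\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*";
console.log(wrapStarred(sampleText, "highlight"));
console.log(wrapStarred("a\\*b*c", "highlight")); // a\*b*c (unchanged)
```

The function uses only ES5 syntax, so it should also run in IE11. Note that an empty pair (`**`) produces an empty span with this pattern.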

One thing to be careful with: when you write the test string as a JavaScript string literal, escape each backslash with another backslash. If you don't, the in-memory string will not contain a backslash at all.
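
A small check of the escaping behaviour described here; the variable names `unescaped` and `escaped` are made up for this illustration:

```javascript
// "\*" is an unnecessary escape: the backslash is dropped, leaving one character.
var unescaped = "\*";
// "\\*" is a backslash followed by an asterisk: two characters in memory.
var escaped = "\\*";

console.log(unescaped.length); // 1
console.log(escaped.length);   // 2
console.log(escaped);          // \*
```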

trincot
  • Hey trincot, thanks for the answer, but the escaped characters only use one backslash \. I tried to change it, but then the solution weirdly doesn't work for me. I also realize I wasn't precise enough: I need to use the regex in a string.replace call, so I'm not sure the group match solution would work. – marvimarvv Jul 06 '20 at 14:41
  • That is a misunderstanding. Also in my code there is really only one backslash. But to tell JavaScript it is a backslash you need to escape it. When you print it with `console.log(str)` you'll see there is only one backslash each time. If you print `console.log("\*")` however, you'll see there is only the asterisk. You just escaped the asterisk! It is a string with one character only. I added a `string.replace` example to my answer. BTW: you can check the link to regex101, which does not need that escaping (because the data is not a JavaScript string): so you can convince yourself of the regex – trincot Jul 06 '20 at 14:43
  • If you can't make it work, then update your question and bring it closer to your actual situation. – trincot Jul 06 '20 at 14:45
  • Thanks a lot! That was super helpful and I got it working! :) – marvimarvv Jul 06 '20 at 14:52

const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr.match(/\*(\\.)*t(\\.)*e(\\.)*s(\\.)*t(\\.)*\*/g).map(s => s.slice(1, -1));
console.log(m);

More generic code:

const prepareRegExp = (word, delimiter = '\\*') => {
  const escaped = '(\\\\.)*';
  return new RegExp([
    delimiter,
    escaped,
    [...word].join(escaped),
    escaped,
    delimiter
  ].join``, 'g');
};

const testStr = `\\*jdjdjdfdf*test*dfsdf\\*adfasdasdasd*test**test\\**sd*`;
const m = testStr
    .match(prepareRegExp('test'))
    .map(s => s.slice(1, -1)); // strip the surrounding delimiters

console.log(m);
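
One caveat with building the pattern from `word`: characters such as `.` or `+` would be interpreted as regex syntax. A possible variant, sketched here with the hypothetical names `escapeChar` and `prepareRegExpSafe`, escapes each character of the word before joining. Like the code above it uses ES2015 syntax, so it would need transpiling for IE11.

```javascript
// Hypothetical helper: backslash-escape a character if it is a regex metacharacter.
const escapeChar = ch => ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical variant of prepareRegExp that treats `word` as literal text.
const prepareRegExpSafe = (word, delimiter = '\\*') => {
  const escaped = '(\\\\.)*';
  return new RegExp(
    delimiter + escaped + [...word].map(escapeChar).join(escaped) + escaped + delimiter,
    'g'
  );
};

// The dot in 'a.b' now matches only a literal dot.
console.log('*a.b* *axb*'.match(prepareRegExpSafe('a.b'))); // ["*a.b*"]
```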

https://instacode.dev/#Y29uc3QgcHJlcGFyZVJlZ0V4cCA9ICh3b3JkLCBkZWxpbWl0ZXIgPSAnXFwqJykgPT4gewogIGNvbnN0IGVzY2FwZWQgPSAnKFxcXFwuKSonOwogIHJldHVybiBuZXcgUmVnRXhwKFsKICAgIGRlbGltaXRlciwKICAgIGVzY2FwZWQsCiAgICBbLi4ud29yZF0uam9pbihlc2NhcGVkKSwKICAgIGVzY2FwZWQsCiAgICBkZWxpbWl0ZXIKICBdLmpvaW5gYCwgJ2cnKTsKfTsKCmNvbnN0IHRlc3RTdHIgPSBgXFwqamRqZGpkZmRmKnRlc3QqZGZzZGZcXCphZGZhc2Rhc2Rhc2QqdGVzdCoqdGVzdFxcKipzZCpgOwpjb25zdCBtID0gdGVzdFN0cgoJLm1hdGNoKHByZXBhcmVSZWdFeHAoJ3Rlc3QnKSkKCS5tYXAobSA9PiBtLnN1YnN0cigxLCBtLmxlbmd0aC0yKSk7Cgpjb25zb2xlLmxvZyhtKTs=

gkucmierz
  • I don't think the OP is asking to literally match the letters from "test". It could be another string. That was just an example. – trincot Jul 06 '20 at 14:21
  • Correct, I added a generic solution too. – gkucmierz Jul 06 '20 at 14:26
  • Hey @gkucmierz, thanks for the answer, but I think it is too far off for me. I should have been more precise: I want to use the regex in a string.replace call. – marvimarvv Jul 06 '20 at 14:39
  • @gkucmierz, I think you misunderstood the question. It is not about matching a sequence of letters. – trincot Jul 06 '20 at 14:58