8

I am using a regex to replace ( in other regexes (or regexs?) with (?: to turn them into non-capturing groups. My expression assumes that no (?X structures are used and looks like this:

(
  [^\\]     - Not backslash character
  |^        - Or string beginning
)
(?:
  [\(]      - a bracket
)

Unfortunately, this doesn't work when two matches are directly next to each other, because the first match consumes the character that the second one needs in front of it. For example: how((\s+can|\s+do)(\s+i)?)?
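To make the failure concrete, here is a minimal sketch that applies the expression above (written on one line) to that string. The variable names are only for illustration.

```javascript
// The expression from the question: a non-backslash character
// (or the string start), followed by an opening bracket.
var naive = /([^\\]|^)\(/g;
var input = 'how((\\s+can|\\s+do)(\\s+i)?)?';

// Put the consumed character back and swap ( for (?:.
var output = input.replace(naive, '$1(?:');

console.log(output);
// how(?:(\s+can|\s+do)(?:\s+i)?)?
```

The second bracket of `((` is left untouched: the first match ends right after the first bracket, so nothing is left to serve as the "preceding character" for the second one.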

With lookbehinds, the solution is easy:

/(?<=[^\\]|^)[\(]/g

But JavaScript doesn't support lookbehinds, so what can I do? My searches didn't turn up any easy, universal lookbehind alternative.
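For reference, engines that implement the ES2018 lookbehind assertions accept the pattern above directly. The sketch below assumes such a runtime; older engines throw a SyntaxError when parsing the regex literal, which is the situation the question is about.

```javascript
// Assumes a runtime with ES2018 lookbehind support.
// The ( must be preceded by a non-backslash character or the string start.
var lookbehind = /(?<=[^\\]|^)\(/g;
var converted = 'how((\\s+can|\\s+do)(\\s+i)?)?'.replace(lookbehind, '(?:');

console.log(converted);
// how(?:(?:\s+can|\s+do)(?:\s+i)?)?
```

Because the lookbehind consumes nothing, both brackets in `((` are matched.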

Tomáš Zato

3 Answers

2

Use lookbehind through reversal:

function revStr(str) {
    return str.split('').reverse().join('');
}

var rx = /[(](?=[^\\]|$)/g;
var subst = ":?(";

var data = "how((\\s+can|\\s+do)(\\s+i)?)?";
var res = revStr(revStr(data).replace(rx, subst)); 
document.getElementById("res").value = res;
<input id="res" />

Note that the regex pattern is also reversed so that we could use a look-ahead instead of a look-behind, and the substitution string is reversed, too. It becomes too tricky with longer regexps, but in this case, it is still not that unreadable.
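As a rough sketch, the reversal trick can be wrapped in a reusable function. The names `reverseString` and `nonCapturingViaReversal` are made up for this example, and the same assumption as in the question applies (no (?X structures in the input).

```javascript
function reverseString(str) {
    return str.split('').reverse().join('');
}

function nonCapturingViaReversal(pattern) {
    // In the reversed string, "preceded by" becomes "followed by",
    // so the lookbehind can be written as a lookahead.
    var reversed = reverseString(pattern);
    // ":?(" is "(?:" spelled backwards.
    var replaced = reversed.replace(/\((?=[^\\]|$)/g, ':?(');
    return reverseString(replaced);
}

console.log(nonCapturingViaReversal('how((\\s+can|\\s+do)(\\s+i)?)?'));
// how(?:(?:\s+can|\s+do)(?:\s+i)?)?
```

The `$` alternative in the lookahead covers a bracket at the very start of the original pattern, which ends up at the end of the reversed string.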

Wiktor Stribiżew
2

One option is to do a two-pass replacement, with a token (I like unicode for this, as it's unlikely to appear elsewhere):

var s = 'how((\\s+can|\\s+do)(\\s+i)?)?';
var token = "\u1234";
// Look for the character preceding the ( you want
// to replace. We'll add the token after it.
var patt1 = /([^\\])(?=\()/g;
// The second pattern looks for the token and the (.
// We'll replace both with the desired string.
var patt2 = new RegExp(token + '\\(', 'g');

s = s.replace(patt1, "$1" + token).replace(patt2, "(?:");

console.log(s);

https://jsfiddle.net/48e75wqz/1/

nrabinowitz
  • 1
    Good workaround, too +1:) I just checked out of interest: `\u1234 ETHIOPIC SYLLABLE SEE`. – Wiktor Stribiżew Jul 31 '15 at 21:28
  • There's a great writeup on this kind of [Mimicking Lookbehind in JavaScript](http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript) for further reading. – Adam Katz Jul 31 '15 at 21:33
  • 1
    As David Conrad pointed out to me in [an answer of mine](http://stackoverflow.com/a/26791459/17300), don't use an _"unlikely"_ character like `\u1234` — use a unicode code point that _is **not** a character_ — I now use `\uFDD1` — from the unicode [private use area](http://www.unicode.org/faq/private_use.html#nonchar4). – Stephen P Jul 31 '15 at 22:15
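Putting the comments together, a two-pass version using the noncharacter `\uFDD1` as the token might look like the sketch below. The function name `markAndReplace` is invented for this example, and the extra check for a bracket at the very start of the string is an addition that is not part of the answer above.

```javascript
// U+FDD1 is a Unicode noncharacter, so it should not occur in real text.
var TOKEN = '\uFDD1';

function markAndReplace(pattern) {
    // Pass 1: put the token after every non-backslash character that is
    // directly followed by (. The lookahead leaves the ( unconsumed,
    // so each of two adjacent brackets gets its own token.
    var marked = pattern.replace(/([^\\])(?=\()/g, '$1' + TOKEN);
    // A bracket at the very start has no preceding character; mark it by hand.
    if (marked.charAt(0) === '(') {
        marked = TOKEN + marked;
    }
    // Pass 2: replace each token-plus-bracket pair with (?:.
    return marked.replace(new RegExp(TOKEN + '\\(', 'g'), '(?:');
}

console.log(markAndReplace('how((\\s+can|\\s+do)(\\s+i)?)?'));
// how(?:(?:\s+can|\s+do)(?:\s+i)?)?
```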
0

(EDITED)

string example:

how((\s+can|\s+do)(\s+i)?)?

one line solution:

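var o = 'how((\\s+can|\\s+do)(\\s+i)?)?';
// Hide each escaped \( behind the placeholder 9000000000 (9e9), turn every
// remaining ( into (?:, then restore the escaped brackets.
alert(o.replace(/\\\(/g, 9e9).replace(/\(/g, '(?:').replace(/90{9}/g, '\\('));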

result:

how(?:(?:\s+can|\s+do)(?:\s+i)?)?

and of course it works with strings like how((\s+\(can\)|\s+do)(\s+i)?)?
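The same "skip the escaped brackets" idea can also be done in a single pass, without a placeholder, by letting the regex consume every backslash together with the character it escapes. This is a different technique from the one-liner above, shown only as a sketch; the function name `groupsToNonCapturing` is made up, and it still assumes that no (?X structures are present and that brackets inside character classes are not a concern.

```javascript
function groupsToNonCapturing(pattern) {
    // \\[\s\S] matches a backslash plus the character it escapes,
    // so an escaped \( is swallowed as a pair and never seen as a bare (.
    return pattern.replace(/\\[\s\S]|\(/g, function (match) {
        // Only a bare bracket is rewritten; escape pairs are returned as-is.
        return match === '(' ? '(?:' : match;
    });
}

console.log(groupsToNonCapturing('how((\\s+\\(can\\)|\\s+do)(\\s+i)?)?'));
// how(?:(?:\s+\(can\)|\s+do)(?:\s+i)?)?
```

Since escape pairs are consumed left to right, an escaped backslash followed by a real bracket (`\\(` in the pattern) is also treated as an opening group.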

AwokeKnowing