25

Example string: $${a},{s$${d}$$}$$

I'd like to match $${d}$$ first and replace it with some text so the string becomes $${a},{sd}$$; then $${a},{sd}$$ will be matched.

Marcel Korpel
Ozgur
  • 2
    Couldn't you just use two separate regular expressions? Match #1 first, replace, and then try to match #2? – Pandincus Dec 11 '10 at 00:10
  • @Pandincus: That would not allow nesting, otherwise yes. – Orbling Dec 11 '10 at 00:17
  • For anyone coming here hoping to solve a recursive problem with regular expressions, something like https://pegjs.org/ may actually be more helpful. For instance, rules like `var = "$${" name "}$$"` would allow you to build a data structure that mimics the AST. At the end of the day, as simple as this is, it's truthfully a programming language, and don't be afraid to use the right tools for the job! – btown Apr 05 '20 at 22:43

6 Answers

41

Annoyingly, JavaScript does not support the PCRE recursion construct (?R), so handling nested delimiters is far from easy. It can be done, however.

I won't reproduce code here, but Steve Levithan's blog has some good articles on the subject, as you would expect: he is probably the leading authority on RegExp in JS. He wrote XRegExp, which fills in most of the PCRE features that are missing, and there is even a Match Recursive plugin!
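
As a rough illustration of what a recursive match has to do when (?R) is unavailable, here is a hand-rolled sketch that finds the outermost balanced $${ ... }$$ spans by counting nesting depth. This is not XRegExp's API; `findBalanced` and its default delimiters are names assumed for this example only.

```javascript
// Hypothetical helper (not from XRegExp): returns the outermost balanced
// OPEN...CLOSE spans. Assumes the delimiters are plain literal strings.
function findBalanced(str, open = "$${", close = "}$$") {
  const spans = [];
  let depth = 0;   // how many opening delimiters are currently unclosed
  let start = -1;  // index where the current outermost span began
  let i = 0;
  while (i < str.length) {
    if (str.startsWith(open, i)) {
      if (depth === 0) start = i;
      depth++;
      i += open.length;
    } else if (depth > 0 && str.startsWith(close, i)) {
      depth--;
      i += close.length;
      if (depth === 0) spans.push(str.slice(start, i));
    } else {
      i++;
    }
  }
  return spans;
}

console.log(findBalanced("x $${a}$$ y $${b}$$")); // two top-level spans
```

On the string from the question, the whole input comes back as a single span, because the inner $${d}$$ is nested inside it.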

Orbling
  • 1
    I wouldn’t say that XRegExp replaces ‘most of the parts that are missing’, but it ***does*** help. For real regexes, though, you need full property and grapheme support. More than 80% of the web is Unicode now, and it’s a crime that you can’t cope with it in the browser. – tchrist Feb 23 '12 at 02:31
  • @tchrist: The English-speaking world barely uses it, so it is therefore unimportant to the people who could change it. That added on to the principle of impossibly slow change in the base level of the web makes such things still a way off. Inconvenient to say the least. – Orbling Feb 23 '12 at 14:58
  • 2
    @Orbling The English-speaking world very much ***does use Unicode***, and a great deal‼ See [this analysis of one large English corpus](http://stackoverflow.com/questions/5567249/what-are-the-most-common-non-bmp-unicode-characters-in-actual-use). I’ve done others since then. Most web pages are in Unicode—you merely do not realize it. You cannot write English properly without it: no curly quotes, no £10 note, no 5¢ piece, &c&c. The web has seen **a meteoric *800% growth* in Unicode** over the past 5 years. That is fast change, not slow‼ People aren’t paying attention, but Unicode is here nonetheless. – tchrist Feb 23 '12 at 15:30
  • @tchrist: Yes, people do use Unicode, because they should. It is totally not needed for those examples you give, they are all in most western code spaces. Usually [ISO/IEC 8859-1](http://en.wikipedia.org/wiki/ISO/IEC_8859-1) is used on European websites, which is 8-bit extended ASCII of a sort. The cent symbol is available as 162=¢ and the pound as 163=£ (as a UK resident, the pound is also on my keyboard and has been so for a lot longer than Unicode has been present). All a matter of codepages, which most webpages still support. – Orbling Feb 23 '12 at 16:00
  • 3
    @tchrist: UTF8/16 are increasingly output as standard, because the webservers and editors are adopting it as default. Curly quotes are awful things anyhow, anathema to programmers. ;-) – Orbling Feb 23 '12 at 16:01
  • 2
    @Orbling: No, UTF-8 only, not UTF-16. Nobody does webpages in UTF-16: that's dumb. UTF-16 has all the disadvantages of both UTF-8 and UTF-32, but enjoys none of the benefits of either. UTF-16 is a sorry legacy. – tchrist Feb 23 '12 at 16:41
  • @tchrist: Sorry, should have been clearer - UTF8 for webpages, a number of editors still use UTF16 when in a Unicode mode. – Orbling Feb 23 '12 at 17:26
4

I wrote this myself:

String.prototype.replacerec = function (pattern, what) {
    var newstr = this.replace(pattern, what);
    if (newstr == this)
        return newstr;
    return newstr.replace(pattern, what);
};

Usage:

"My text".replacerec(/pattern/g,"what");

P.S: As suggested by @lededje, when using this function in production it's good to have a limiting counter to avoid stack overflow.
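
A minimal sketch of the limiting-counter idea, written as a loop rather than recursion so the call stack never grows. The name `replaceUntilStable` and the `maxPasses` default of 100 are assumptions made for this example, not part of the answer above.

```javascript
// Hypothetical helper: repeat the replacement until the string stops
// changing, giving up after maxPasses passes (an arbitrary default).
function replaceUntilStable(str, pattern, what, maxPasses = 100) {
  let current = String(str);
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = current.replace(pattern, what);
    if (next === current) return current; // nothing changed: fixed point reached
    current = next;
  }
  throw new Error("replaceUntilStable: still changing after " + maxPasses + " passes");
}

// Peels one layer of parentheses per pass until none are left.
console.log(replaceUntilStable("((((x))))", /\(([^()]*)\)/g, "$1")); // prints "x"
```

Throwing when the cap is hit is one possible design choice; returning the last intermediate string would also work if a partial result is acceptable.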

Akash Budhia
  • 2
    I used it in production code that has been running for over a year. It's rare for a regex to keep matching forever, so no overflow! And it's a quick way to get recursive replace straight from JavaScript code. – Akash Budhia Jun 04 '13 at 07:22
  • 1
    The stack's limit is not infinity. IE6 can only handle 1130 calls. That's not 1130 regexp matches, it's total regexp matches plus whatever else you have going on. Saying this is a good enough answer is not correct because someone could be using it in an already function intensive environment, and something that shouldn't be adding to the stack could push it to overflow. so -1. – lededje Jun 04 '13 at 10:15
  • 3
    This can't recurse infinitely... there's no recursion? – Patrick Roberts May 03 '18 at 06:09
  • 2
    I believe the line# 5 {return newstr.replace(pattern, what);} is supposed to be {return newstr.replacerec(pattern, what);} to obtain recursion. (add "rec" at the end of "replace"). Agree? – Marcelo Finki Jul 19 '19 at 10:02
  • 1
    There's no need for recursion. String.prototype.replacerec = function (pattern, what) { var prev = null; while (prev !== what) { prev = what; what = this.replace(pattern, what); } return what; }; – Whatabrain Jan 18 '22 at 15:42
0

Since you want to do this recursively, you are probably best off doing multiple matches using a loop.

Regex itself is not well suited for recursive-anything.

BlueRaja - Danny Pflughoeft
0
var content = "your string content";
var found = true;
while (found) {
    found = false;
    content = content.replace(/regex/, () => { found = true; return "new value"; });
}
  • Although the concepts are perhaps there, there's so much that won't work, can go very wrong, and doesn't address the question asked. – Matt Fletcher Dec 12 '17 at 17:52
  • What can go wrong? This simple pattern can solve the problem in the question with the right regex definition. – Burak Büyükatlı Dec 12 '17 at 18:42
  • It has no fallback, so if it doesn't match, it will probably exceed memory allowance. Also what is "new value" and where should it come from? And you're not showing how OP's regex could actually work in this code. – Matt Fletcher Dec 12 '17 at 18:44
  • You are wrong. If there is no more matched value 'found' stays false and the while loop exits. "new value" is the new value for matched string. – Burak Büyükatlı Dec 12 '17 at 18:49
0

You can try \$\${([^\$]*)}\$\$. The [^\$]* means the captured group cannot contain $, so only an innermost $${...}$$ pair matches.

var re = new RegExp(/\$\${([^\$]*)}\$\$/, 'g'),
  original = '$${a},{s$${d}$$}$$',
  result = original.replace(re, "$1");
  
console.log('original: ' + original);
console.log('result: ' + result);
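
A single pass only resolves the innermost pair, so the snippet above prints $${a},{sd}$$ as its result. The sketch below repeats the same idea until nothing matches and records each intermediate string. `expandNested`, `render`, and the `maxPasses` cap are hypothetical names introduced for illustration, and the braces are escaped here only to keep the pattern valid under the `u` flag as well.

```javascript
// Matches only an innermost $${...}$$, because the body may not contain "$".
const INNERMOST = /\$\$\{([^$]*)\}\$\$/g;

// Hypothetical helper: `render` turns each captured body into its replacement
// text. Returns every intermediate string, starting with the input itself.
function expandNested(str, render, maxPasses = 50) {
  const steps = [str];
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = str.replace(INNERMOST, (match, body) => render(body));
    if (next === str) break; // no pair left to resolve
    str = next;
    steps.push(str);
  }
  return steps;
}

const steps = expandNested("$${a},{s$${d}$$}$$", (body) => body);
console.log(steps);
// [ '$${a},{s$${d}$$}$$', '$${a},{sd}$$', 'a},{sd' ]
```

One limitation of this pattern is that a body containing a literal $ is never matched, so such input is left untouched.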
uingtea
-2

In general, regexps are not well suited to that kind of problem. It's better to use a state machine.
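
One way this could look, as a sketch under assumptions rather than a definitive implementation: a single left-to-right scan that keeps a stack of text buffers, one per open $${. The names `expandWithStack` and `render` are made up for this example, and the delimiters are assumed to be exactly the literal strings $${ and }$$.

```javascript
// Hypothetical single-pass expander. Each "$${" pushes a fresh buffer;
// each "}$$" pops it, renders it, and appends the result to its parent.
function expandWithStack(str, render) {
  const OPEN = "$${";
  const CLOSE = "}$$";
  const stack = [""]; // stack[0] collects the top-level output
  let i = 0;
  while (i < str.length) {
    if (str.startsWith(OPEN, i)) {
      stack.push(""); // enter a nested placeholder
      i += OPEN.length;
    } else if (stack.length > 1 && str.startsWith(CLOSE, i)) {
      const body = stack.pop(); // inner placeholders are already expanded
      stack[stack.length - 1] += render(body);
      i += CLOSE.length;
    } else {
      stack[stack.length - 1] += str[i]; // ordinary character
      i++;
    }
  }
  if (stack.length > 1) throw new Error("unbalanced $${ in input");
  return stack[0];
}

console.log(expandWithStack("$${a},{s$${d}$$}$$", (body) => "[" + body + "]"));
// prints "[a},{s[d]]"
```

Because inner placeholders are rendered before their parent is closed, the innermost-first order asked for in the question falls out of the scan without repeated passes.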

Vojta