I have to extract parts of a string by splitting it on spaces. But because there may also be spaces inside the parts I want to extract, I came up with a regex that ignores those spaces when they are between brackets. Note that I don't fully understand alternation in regex; after a lot of tests I got it working for one bracket level (first log in the example). Brackets may also be absent, so I added the last alternative (|[^\s]+) to match things like tag1 too.

After a lot of (non-working) tests, I came up with the second regexp. It consists of the first alternative from the first regexp, modified to recognize a second level of nesting, followed by the whole first regexp as a second alternative.

This works fine (as long as there is no third nesting level, see the example), but I have a feeling there should be an easier solution, as the pattern seems to be recursive (new nesting level + the whole regexp for the previous level).

Is there a way to solve this in a more general way (maybe not for infinite nesting, but let's say 4 or 5 levels deep)? Maybe with a recursive regexp?

var str = "tag1 tag2 func(foo) func2(foo, bar) func1(func2(foo), bar, func2(bar)) func1(func2(foo, func1(foo)), bar)";

console.log( str.match(/([^\s]*\([^()]+\)[^\s]*|[^\s]+)/g) );

console.log( str.match(/([^\s]*\((?:[^()]*\([^()]+\)[^()]*)+\)[^\s]*|(?:[^\s]*\([^()]+\)[^\s]*|[^\s]+))/g) );
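The "new nesting level + previous level" observation can be turned into a loop that builds the pattern string. What follows is only a minimal sketch under some assumptions: the helper name `buildNestedRegex` is made up for illustration, a part is assumed to be either a bare token or a name directly followed by one balanced parenthesis group, and, unlike the two regexps above, it does not accept extra characters glued before or after the parentheses.

```javascript
// Hypothetical helper: builds a tokenizer regex that tolerates up to
// `depth` levels of nested parentheses (depth >= 1).
function buildNestedRegex(depth) {
  // Content of a parenthesis pair that contains no further parentheses.
  let body = "[^()]*";
  for (let level = 2; level <= depth; level++) {
    // Wrap the previous level: plain text, then any number of
    // "( previous-level body ) plain text" groups.
    body = "[^()]*(?:\\(" + body + "\\)[^()]*)*";
  }
  // Either name(...) with balanced content, or a bare token such as tag1.
  return new RegExp("[^\\s()]+\\(" + body + "\\)|[^\\s()]+", "g");
}

const sample = "tag1 tag2 func(foo) func2(foo, bar) func1(func2(foo), bar, func2(bar)) func1(func2(foo, func1(foo)), bar)";
console.log(sample.match(buildNestedRegex(3)));
```

With a depth of 3 this should return the six top-level parts of the example string. If the input nests deeper than the chosen depth, the regex falls back to the bare-token alternative and splits that part incorrectly, so the depth has to be chosen generously.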

Edit: I don't mind this being marked as a duplicate, but note that I searched a lot about this problem (match a pattern, except when it occurs between certain characters, at a finite nesting level). My question is specific to that; recursive regex was only a suggestion for solving it, not the main point. The topic marked as duplicate does not actually help me in any way.

Kaddath
  • Side note: sadly, JS regex engine doesn't support [regex recursion](http://www.regular-expressions.info/recurse.html). – sp00m Feb 28 '17 at 13:08
  • @sp00m gasp, I guess that answers a crucial point of the question, but maybe someone has a better solution than mine for things like 4 levels deep. Otherwise I will probably write a function that builds the regex string depending on the level, but it will be a long string. – Kaddath Feb 28 '17 at 13:13
  • I was going to answer this to tell you *not* to try contrived solutions and to just write a really simple "parser" yourself... Anyway it's closed, but have [the code](http://pastebin.com/Ct9VUxSj) since I already wrote the stuff. – Lucas Trzesniewski Feb 28 '17 at 13:28
  • thanks a lot for these answers, that does help! – Kaddath Feb 28 '17 at 13:32
  • for up to 3 levels such as [`[^)(\s]+\((?:[^)(]+|\((?:[^)(]+|\((?:[^)(]+|\([^)(]*\))*\))*\))*\)|[^)(\s]+`](https://www.regex101.com/r/sTBnd8/1) (you can easily add levels by breaking the relevant part apart [like this demo](https://www.regex101.com/r/dFH8mg/1) and rejoining it for a JS regex). – bobble bubble Feb 28 '17 at 17:10
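
The "really simple parser" idea from the comments can be illustrated without any regex nesting at all. The code behind the pastebin link is not reproduced here, so this is not that code, only a minimal sketch of the same idea: walk the string once, count open parentheses, and split on whitespace only when the count is zero. The function name `splitTopLevel` is hypothetical.

```javascript
// Hypothetical helper: splits on whitespace that is not inside parentheses.
// Works for any nesting depth because it only tracks a counter.
function splitTopLevel(input) {
  const parts = [];
  let current = "";
  let depth = 0; // number of currently open parentheses
  for (const ch of input) {
    if (ch === "(") {
      depth++;
    } else if (ch === ")" && depth > 0) {
      depth--;
    }
    if (depth === 0 && /\s/.test(ch)) {
      // Top-level whitespace ends the current part (empty parts are skipped).
      if (current !== "") parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (current !== "") parts.push(current);
  return parts;
}

console.log(splitTopLevel("tag1 func2(foo, bar) func1(func2(foo, func1(foo)), bar)"));
```

The design trade-off is that unbalanced input is not rejected: an unclosed parenthesis simply swallows the rest of the string into the last part, and a stray closing parenthesis is kept as an ordinary character.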
