6

My string is: (as(dh(kshd)kj)ad)... ()()

How is it possible to count the parentheses with a regular expression? I would like to select the string which begins at the first opening bracket and ends before the ...

Applying that to the above example, that means I would like to get this string: (as(dh(kshd)kj)ad)

I tried to write it, but this doesn't work:

var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/(.*)/m));
mwilson
  • 1,255
  • 12
  • 17
don kaka
  • 1,179
  • 2
  • 12
  • 28
  • 1
    This is possible only up to a fixed level. That is: you can choose to support up to, say, 5 levels of nesting. – acdcjunior Sep 09 '13 at 17:01

6 Answers6

12

As I said in the comments, contrary to popular belief (don't believe everything people say) matching nested brackets is possible with regex.

The downside of using it is that you can only do it up to a fixed level of nesting. And for every additional level you wish to support, your regex will be bigger and bigger.

But don't take my word for it. Let me show you. The regex \([^()]*\) matches one level. For up to two levels see the regex here. To match your case, you'd need:

\(([^()]*|\(([^()]*|\([^()]*\))*\))*\)

It would match the bold part: (as(dh(kshd)kj)ad)... ()()

Check the DEMO HERE and see what I mean by fixed level of nesting.

And so on. To keep adding levels, all you have to do is change the last [^()]* part to ([^()]*|\([^()]*\))* (check three levels here). As I said, it will get bigger and bigger.

acdcjunior
  • 132,397
  • 37
  • 331
  • 304
4

See Tim's answer for why this won't work, but here's a function that'll do what you're after instead.

function getFirstBracket(str){
  var pos = str.indexOf("("),
      bracket = 0;

  if(pos===-1) return false;

  for(var x=pos; x<str.length; x++){
    var char = str.substr(x, 1);    
    bracket = bracket + (char=="(" ? 1 : (char==")" ? -1 : 0));
    if(bracket==0) return str.substr(pos, (x+1)-pos);
  }
  return false;
}

getFirstBracket("(as(dh(kshd)kj)ad)... ()(");
Doug
  • 3,312
  • 1
  • 24
  • 31
  • 1
    +1, This works great if the string always starts with an opening paren, needs some work if it should give the first paren group for a string like `"foo(bar)baz"`. – Andrew Clark Sep 09 '13 at 17:05
  • True enough! I'm unsure if that's really needed, but I've updated it so it can happily cope with that. – Doug Sep 09 '13 at 17:15
4

There is a possibility and your approach was quite good: Match will give you an array if you had some hits, if so you can look up the array length.

var str = "(as(dh(kshd)kj)ad)... ()()",
    match = str.match(new RegExp('.*?(?:\\(|\\)).*?', 'g')),
    count = match ? match.length : 0;

This regular expression will get all parts of your text that include round brackets. See http://gskinner.com/RegExr/ for a nice online regex tester.

Now you can use count for all brackets. match will deliver a array that looks like:

["(", "as(", "dh(", "kshd)", "kj)", "ad)", "... (", ")", "(", ")"]

Now you can start sorting your results:

var newStr = '', open = 0, close = 0;

for (var n = 0, m = match.length; n < m; n++) {
    if (match[n].indexOf('(') !== -1) {
        open++;
        newStr += match[n];
    } else {
        if (open > close) newStr += match[n];
        close++;
    }
    if (open === close) break;
}

... and newStr will be (as(dh(kshd)kj)ad)

This is probably not the nicest code but it will make it easier to understand what you're doing.

With this approach there is no limit of nesting levels.

jsmorph
  • 221
  • 1
  • 5
3

This is not possible with a JavaScript regex. Generally, regular expressions can't handle arbitrary nesting because that can no longer be described by a regular language.

Several modern regex flavors do have extensions that allow for recursive matching (like PHP, Perl or .NET), but JavaScript is not among them.

Tim Pietzcker
  • 328,213
  • 58
  • 503
  • 561
  • Even though you're right, if you'll check the example that he posted you'll see that it's doable using regex (the title is misleading). – Nir Alfasi Sep 09 '13 at 17:04
  • @alfasin: I don't think those `...` are meant as a delimiter. But you're right, from the way the question is written, one could get that idea. – Tim Pietzcker Sep 09 '13 at 17:05
2

No. Regular expressions express regular languages. Finite automatons (FA) are the machines which recognise regular language. A FA is, as its name implies, finite in memory. With a finite memory, the FA can not remember an arbitrary number of parentheses - a feature which is needed in order to do what you want.

I suggest you use an algorithms involving an enumerator in order to solve your problem.

Pétur Ingi Egilsson
  • 4,368
  • 5
  • 44
  • 72
  • 1
    But in practice regex patterns can *never* be infinite (because the computers that hold them have finite memory). Therefore, the number of parentheses is *always* finite. – étale-cohomology Apr 04 '18 at 09:05
1

try this jsfiddle

var str = "(as(dh(kshd)kj)ad)... ()()";
document.write(str.match(/\((.*?)\.\.\./m)[1] );
Anoop
  • 23,044
  • 10
  • 62
  • 76
  • 2
    I'm pretty sure that the string he gave was an example string and that the ellipsis symbolizes missing content rather than something that can be relied upon. – Doug Sep 09 '13 at 17:04
  • @Dolondro the title is misleading - a counter is not relevant, +1 for good answer! – Nir Alfasi Sep 09 '13 at 17:05
  • @Dolondro I read op question multiple times tried. I think he want to get string from specific character to other specific character – Anoop Sep 09 '13 at 17:07