3

I have a file that looks something like this:

...some other block above also with a { block }

Main:   Subroutine( )
{ <--
    Include(foo = bar )
    Call(foo = bar )
    Repeat(foo = ibar )
    {
        Message("Message = bar number {ibar}" foo )
        Something( )
        Message("Message = foo {bar}" )
    }
    Message("Message = again  {iterations}" )
    For(start = foo , end = bar  )
    {
        Comment( )
    }
    While(foo )
    {
        Comment( )
    }
    Comment( )
} <--
... some other block below also with a { block }

I need to match everything between the parent brackets marked with <--, and I came up with this:

/^Main:\s*\w*\(\s*\)\s*\{\s*((?:.*\s*)*?)\}$/gm

but it stops after the } of the first nested block, and I can't figure out how to reach the last bracket.

Is there any way to match until the curly bracket right in front of a new line?

Thanks!

Edit: I should add that any number of nested { } blocks is possible.

shiiboun
  • JS does not support recursive regular expressions, which you would need to resolve the nested `{ ... }`. Do this manually: find `Main:`, then find the first `{`, and then count `{` and `}` until the number of opening and closing brackets is equal. That is where the block ends. – Thomas Nov 19 '19 at 10:08
  • @Thomas can you maybe provide a small code snippet? – shiiboun Nov 19 '19 at 10:30
  • Maybe you can use [XRegExp Api](http://xregexp.com/api/#matchRecursive): `XRegExp.matchRecursive(str, '{', '}', 'g'...` – bobble bubble Nov 19 '19 at 13:19
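
The counting approach described in the first comment can be sketched as follows. This is a minimal illustration, not code from the thread: `extractBlock` and the `sample` text are hypothetical names, and the sketch assumes that any braces inside string literals (such as `{bar}`) are balanced, as they are in the question's file.

```javascript
// Hypothetical helper: returns the text between the outermost braces that
// follow `label`, or null if the label or a balanced block is not found.
function extractBlock(source, label) {
  const labelIndex = source.indexOf(label);
  if (labelIndex === -1) return null;

  // The block starts at the first "{" after the label.
  const open = source.indexOf('{', labelIndex);
  if (open === -1) return null;

  let depth = 0;
  for (let i = open; i < source.length; i++) {
    const ch = source[i];
    if (ch === '{') depth++;
    else if (ch === '}') depth--;

    // Depth returns to zero exactly at the brace that closes the block.
    if (depth === 0) {
      return source.slice(open + 1, i);
    }
  }
  return null; // braces never balanced
}

// Shortened sample in the shape of the question's file (an assumption).
const sample = `Main:   Subroutine( )
{
    Repeat(foo = ibar )
    {
        Message("Message = foo {bar}" )
    }
    Comment( )
}
Other:  Subroutine( )
{
    Comment( )
}`;

console.log(extractBlock(sample, 'Main:'));
```

Because the loop only tracks depth, it copes with any nesting level. If unbalanced braces can appear inside quoted strings, the loop would also need to skip over string literals.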

4 Answers

1

Many regex implementations don't allow the user to recursively match nested groups. JavaScript does not support the PCRE recursion construct (?R).

Write a small parser instead.
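
As a rough idea of how small such a parser can be, here is a line-based recursive sketch that turns the file into a tree. It is an illustration under stated assumptions, not a definitive implementation: `parseBlocks` is a hypothetical name, and the code assumes that every block brace sits alone on its own line, as in the question's sample.

```javascript
// Hypothetical line-based parser. Assumes each "{" and "}" that delimits a
// block is the only non-whitespace text on its line.
function parseBlocks(source) {
  const lines = source
    .split('\n')
    .map(function (line) { return line.trim(); })
    .filter(function (line) { return line !== ''; });
  let pos = 0;

  // Parses statements until the end of input or the "}" closing this block.
  function parseStatements(insideBlock) {
    const nodes = [];
    while (pos < lines.length) {
      const line = lines[pos];
      if (line === '}') {
        if (!insideBlock) throw new Error('Unexpected }');
        pos++;
        return nodes;
      }
      if (line === '{') {
        if (nodes.length === 0) throw new Error('Block without a header');
        pos++;
        // The block belongs to the statement directly above it.
        nodes[nodes.length - 1].children = parseStatements(true);
        continue;
      }
      nodes.push({ text: line, children: [] });
      pos++;
    }
    if (insideBlock) throw new Error('Missing closing }');
    return nodes;
  }

  return parseStatements(false);
}

const tree = parseBlocks(`Main:   Subroutine( )
{
    Include(foo = bar )
    Repeat(foo = ibar )
    {
        Message("Message = foo {bar}" )
    }
    Comment( )
}`);

console.log(JSON.stringify(tree, null, 2));
```

Each node keeps its header text and a `children` array, so the body of `Main:` is simply `tree[0].children`, whatever the nesting depth.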

Edd
  • It looks like you're writing a compiler of some kind? Take a look at [this wikipedia article](https://en.wikipedia.org/wiki/Recursive_descent_parser). – Edd Nov 19 '19 at 10:29
  • I guess you are right. I thought I could avoid writing a parser. Thanks – shiiboun Nov 19 '19 at 12:01
1

If you want to get the content between curly braces, it is possible to use the split method:

const str = `Main:   Subroutine( )
{
    Include(foo = bar )
    Call(foo = bar )
    Repeat(foo = ibar )
    {
        Message("Message = bar number {ibar}" foo )
        Something( )
        Message("Message = foo {bar}" )
    }
    Message("Message = again  {iterations}" )
    For(start = foo , end = bar  )
    {
        Comment( )
    }
    While(foo )
    {
        Comment( )
    }
    Comment( )
} `

const result = str.split(/[{}]+/)
console.log(result);

UPDATE 1:

I've added some data to make the sample more complicated.

You can find the start index of the desired text and then take a substring to extract the necessary data:

const str = `Main 1 Main:   Subroutine( )
{
Include(foo = bar )
Call(foo = bar )
Repeat(foo = ibar )
{
    Message("Message = bar number {ibar}" foo )
    Something( )
    Message("Message = foo {bar}" )
}
Message("Message = again  {iterations}" )
For(start = foo , end = bar  )
{
    Comment( )
}
While(foo )
{
    Comment( )
}
Comment( )
} `

const strToFind = `Main:   Subroutine( )`;
const preparedString = str.substring(str.indexOf(strToFind));

const result = preparedString.split(/[{}]+/)
console.log(result);
StepUp
1

Nested constructs are a pain for regex; it is usually preferable to use or build a parser for such tasks.

That being said, the case here looks simple enough to allow a match with a fairly simple regex, assuming the <-- markers are actually present in the file.

I'd use something like `^Main:\s*\w*\(\s*\)\s*\{ <--[^}]*(?:\}(?! <--)[^}]*)*\} <--$` with the multiline flag, so that ^ and $ match at line boundaries.

Key points:

  • `\{ <--` matches an opening curly brace followed by the marker.
  • `[^}]*` matches any run of characters other than a closing curly brace.
  • `(?:` begins a non-capturing group:
    • `\}` a closing curly brace,
    • `(?! <--)` not followed by the marker,
    • `[^}]*` then any run of characters other than a closing curly brace; the closing `)*` repeats the group zero or more times.
  • `\} <--` finally matches the marked closing curly brace.
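
The pattern above can be tried out like this. It is a sketch under one loud assumption: the ` <--` markers are literally present in the input, which may not be true of the real file. The `text` sample and variable names are made up for the demonstration.

```javascript
// Assumption: the " <--" markers really exist in the text being searched.
// The m flag lets ^ and $ match at the start and end of individual lines.
const pattern = /^Main:\s*\w*\(\s*\)\s*\{ <--[^}]*(?:\}(?! <--)[^}]*)*\} <--$/m;

const text = `Before:  Subroutine( )
{
    Comment( )
}
Main:   Subroutine( )
{ <--
    Repeat(foo = ibar )
    {
        Message("Message = foo {bar}" )
    }
    Comment( )
} <--
After:  Subroutine( )
{
    Comment( )
}`;

const match = text.match(pattern);
console.log(match ? match[0] : 'no match');
```

The match runs from `Main:` to the marked closing brace because every other `}` is consumed by the repeated group, and only the one followed by ` <--` is left for the final `\} <--`.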
PJProudhon
0

Try this:

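var myString = "Message = {foo} number {bar}";
// Lookbehind and lookahead keep the braces out of the match (lookbehind needs ES2018+)
var reg = /(?<=\{)\w*(?=\})/g;
var myArray = [...myString.matchAll(reg)];
console.log(myArray);
// Two match arrays, one for 'foo' and one for 'bar' (each also carries index and input)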
latanoel