Here is my string:

((A_1:2,B:3):2.1,C_3:1.2,(D:3,(E:4.3,F:2):3.1,):G:1.7);

I need to be able to capture every pair of matching parentheses, e.g.

(A_1:2,B:3)

and

(D:3,(E:4.3,F:2):3.1,)

This:

\([^ ]+\)

will capture the entire string, but I can't find a combination of groups that will capture at least 10 levels of nested parenthesis pairs.

I hope to be able to put the captured groups into a data structure so I can parse it more easily. But first, I need to capture the pairs.

Glubbdrubb

2 Answers


I don't think you need regex for this; if anything, regex makes it harder.
Here's what I came up with, assuming your target data type is an array of strings:

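def split(data):
    # Collect each top-level parenthesized group as a string.
    temp = ""
    # Drop the outermost "(" and the trailing ");" so the inner groups become top-level.
    data = data[1:-2]
    array = []
    closed = opened = 0
    for letter in data:
        if letter == '(':
            opened += 1
        elif letter == ')':
            closed += 1

        # Only accumulate characters while inside a group.
        if opened != 0:
            temp += letter
            # Equal counts mean the current group has just closed.
            if opened == closed:
                array.append(temp)
                temp = ""
                opened = 0
                closed = 0
    return array

print(split("((A_1:2,B:3):2.1,C_3:1.2,(D:3,(E:4.3,F:2):3.1,):G:1.7);"))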
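
If every pair at every nesting level is needed in one pass, a stack of open-parenthesis positions is one way to get there without calling the function repeatedly. This is only a sketch of that alternative, not the function above: the name `all_paren_groups` and the choice to raise `ValueError` on unbalanced input are assumptions made for illustration.

```python
def all_paren_groups(text):
    """Return every balanced '(...)' substring in text, at any nesting depth."""
    stack = []   # indices of '(' characters that are still open
    groups = []  # (start_index, substring) for each pair that has closed
    for i, ch in enumerate(text):
        if ch == '(':
            stack.append(i)
        elif ch == ')':
            if not stack:
                raise ValueError(f"unmatched ')' at index {i}")
            start = stack.pop()
            groups.append((start, text[start:i + 1]))
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    # Sort by opening position so an outer group comes before the groups it contains.
    return [substring for _, substring in sorted(groups)]


for group in all_paren_groups("((A_1:2,B:3):2.1,C_3:1.2,(D:3,(E:4.3,F:2):3.1,):G:1.7);"):
    print(group)
```

Unlike `split`, this version does not strip the outer brackets first, so the whole outermost group is also included in the result. The nesting depth is limited only by the length of the input.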
Maciej Kozieja
  • I can work with an array of strings. This results in: ['(A_1:2,B:3)', '(D:3,(E:4.3,F:2):3.1,)']. The (D:3,(E:4.3,F:2):3.1,) element still needs to be split further. I assume that if I want to go another level, I just run the function again, but on the second element in the array? – Glubbdrubb Feb 11 '17 at 16:13
  • So is it fine? I can change it, but you have to specify the output. – Maciej Kozieja Feb 11 '17 at 16:14
  • @Glubbdrubb Yes, that would work, but then you have to remove `data = data[1:-2]` from the function and apply it once to the initial data instead. For this to work, the input must be `(A_1:2,B:3):2.1,C_3:1.2,(D:3,(E:4.3,F:2):3.1,):G:1.7`, that is, without the outer brackets, which is what `data = data[1:-2]` does. If you want to use the function recursively, you have to remove that line. – Maciej Kozieja Feb 11 '17 at 16:25
  • That's great! :D It would be nice if you accepted the answer :) – Maciej Kozieja Feb 11 '17 at 16:32

You cannot do that with regexp matching. The reason is that the language of all strings of matching parentheses is not regular: recognizing it requires keeping track of an arbitrary nesting depth, which a finite automaton cannot do. Look up the literature on pattern matching and finite automata, and you'll find the mathematical reasoning behind this.
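
To make the point concrete, here is a small sketch assuming Python's built-in `re` module, which has no recursive patterns. A pattern that forbids parentheses inside the match only ever finds the innermost pairs, while tracking the nesting needs a counter that can grow without bound. The helper name `nesting_depth` is made up for this illustration.

```python
import re

s = "((A_1:2,B:3):2.1,C_3:1.2,(D:3,(E:4.3,F:2):3.1,):G:1.7);"

# A pattern with no nested parentheses allowed matches only the innermost pairs.
innermost = re.findall(r"\([^()]*\)", s)
print(innermost)  # ['(A_1:2,B:3)', '(E:4.3,F:2)']


def nesting_depth(text):
    """Return the maximum parenthesis nesting depth, or raise if unbalanced."""
    depth = 0      # the counter a finite automaton has no way to keep
    deepest = 0
    for ch in text:
        if ch == '(':
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ')':
            depth -= 1
            if depth < 0:
                raise ValueError("closing parenthesis without a matching opener")
    if depth != 0:
        raise ValueError("opening parenthesis without a matching closer")
    return deepest


print(nesting_depth(s))
```

The counter in `nesting_depth` is the state that a fixed, finite set of automaton states cannot represent for arbitrarily deep input.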

Luis Colorado