I am trying to write a regular expression to match Ruby's list and hash syntaxes, e.g.:
[:a, "b", c, 3]
{:a => [
1,2,3
]}
[1, {
a => "t", :b => "w",
:c => :o
}, 3]
The issue, of course, is the nested/recursive nature of these structures. I have a sneaking suspicion that such nesting cannot be expressed as a true regular expression, because the 'language' is not regular. I expect a solution would have to involve subroutines and recursion, but I'm struggling to get my head around it. Can anyone confirm or deny my suspicion, or offer a solution?
Any help appreciated.
Edit: as a note, I'm mainly using PHP's preg_* functions.
Edit: as another note, I've created a routine, <ruby_value>, to match keys and scalar values.
Edit: I should specify that I'm asking mostly out of curiosity. I have already written a mini-parser for these structures in PHP, but I am interested to see whether a not-necessarily-pure-regex solution exists.
For example, this pattern matches balanced nested parentheses:
/^(?<paren_expr>
\( (?: (?&paren_expr) | ) \)
)$/x
This is a valid PHP regex; it matches "(())", "()" and "((((((()))))))" but not "(" or "(()" etc.
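For anyone who wants to check that behaviour, here is a minimal sketch that runs the pattern above through preg_match. The pattern and the five test strings come straight from this question; the variable names ($pattern, $inputs) are just placeholders I picked for illustration.

```php
<?php
// The pattern from the question, unchanged: a named group that calls itself
// via (?&paren_expr). The x flag lets the pattern span several lines.
$pattern = '/^(?<paren_expr>
    \( (?: (?&paren_expr) | ) \)
)$/x';

// Test strings from the question: the first three are balanced,
// the last two are not.
$inputs = ['(())', '()', '((((((()))))))', '(', '(()'];

foreach ($inputs as $input) {
    // preg_match returns 1 on a match, 0 on no match, and false on error.
    $result = preg_match($pattern, $input);
    printf("%-16s %s\n", $input, $result === 1 ? 'match' : 'no match');
}
```

What I can't work out is how to take the same self-referencing idea and apply it to the [ ... ] and { ... } cases with values in between.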